...
首页> 外文期刊>Frontiers of computer science >Follow the perturbed approximate leader for solving semi-bandit combinatorial optimization
【24h】

Follow the perturbed approximate leader for solving semi-bandit combinatorial optimization

机译:遵循求解半强盗组合优化的扰动近似领导者

获取原文
获取原文并翻译 | 示例
           

摘要

Combinatorial optimization in the face of uncertainty is a challenge in both operational research and machine learning. In this paper, we consider a special and important class called the adversarial online combinatorial optimization with semi-bandit feedback, in which a player makes combinatorial decisions and gets the corresponding feedback repeatedly. While existing algorithms focus on the regret guarantee or assume there exists an efficient offline oracle, it is still a challenge to solve this problem efficiently if the offline counterpart is NP-hard. In this paper, we propose a variant of the Follow-the-Perturbed-Leader (FPL) algorithm to solve this problem. Unlike the existing FPL approach, our method employs an approximation algorithm as an offline oracle and perturbs the collected data by adding nonnegative random variables. Our approach is simple and computationally efficient. Moreover, it can guarantee a sublinear (1 + ε)-scaled regret of order O(T~(2/3)) for any small ε> 0 for an important class of combinatorial optimization problems that admit an FPTAS (fully polynomial time approximation scheme), in which T is the number of rounds of the learning process. In addition to the theoretical analysis, we also conduct a series of experiments to demonstrate the performance of our algorithm.
机译:面对不确定性的组合优化是运营研究和机器学习中的挑战。在本文中,我们考虑了一种特殊和重要的类,称为对冲在线组合优化的半匪徒反馈,其中玩家使组合决策并重复获得相应的反馈。虽然现有的算法专注于遗憾的保证或假设存在有效的离线Oracle,但如果离线对应物是NP-Hard,则有效地解决此问题仍然是一个挑战。在本文中,我们提出了一种改变的后续扰动 - 领导者(FPL)算法来解决这个问题。与现有的FPL方法不同,我们的方法采用近似算法作为离线Oracle并通过添加非负随机变量来渗透收集的数据。我们的方法简单且计算得高效。此外,它可以保证任何小ε> 0的载位(1 +ε)-scaled遗憾的o(t〜(2/3)),对于承认FPTA的重要组合优化问题(完全多项式时间近似方案),其中T是学习过程的轮次数。除了理论分析之外,我们还进行了一系列实验,以证明我们的算法的性能。

著录项

  • 来源
    《Frontiers of computer science》 |2021年第5期|155404.1-155404.12|共12页
  • 作者单位

    Institute of Computing Technology Chinese Academy of Sciences Bejing 100190 China University of Chinese Academy of Sciences Beijing 100190 China;

    Microsoft Research Asia Beijing 100080 China;

    Institute of Computing Technology Chinese Academy of Sciences Bejing 100190 China University of Chinese Academy of Sciences Beijing 100190 China;

    Institute of Computing Technology Chinese Academy of Sciences Bejing 100190 China University of Chinese Academy of Sciences Beijing 100190 China;

  • 收录信息
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

    online learning; online combinatorial optimiza-tion; semi-bandit; follow-the-perturbed-leader;

    机译:在线学习;在线组合优化;半匪徒;跟随扰动的领导者;

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号