首页> 外文会议>International Conference on Algorithmic Learning Theory >On Following the Perturbed Leader in the Bandit Setting
【24h】

On Following the Perturbed Leader in the Bandit Setting

机译:遵循强盗设置的扰动领导者

获取原文

摘要

In an online decision problem an algorithm is at each time step required to choose one of the feasible points without knowing the cost associated with it. An adversary assigns the cost to possible decisions either obliviously or adaptively. The online algorithm, naturally, attempts to collect as little cost as possible. The cost difference of the online algorithm and the best static decision in hindsight is called the regret of the algorithm. Kalai and Vempala showed that it is possible to have efficient solutions to some problems with a linear cost function by following the perturbed leader. Their solution requires the costs of all decisions to be known. Recently there has also been some progress in the bandit setting, where only the cost of the selected decision is observed. A bound of O(T~(2/3)) on T rounds was first shown by Awerbuch and Kleinberg for the regret against an oblivious adversary and later McMahan and Blum showed that a bound of O((ln T)~(1/2) T~(3/4)) is obtainable against an adaptive adversary. In this paper we study Kalai and Vempala's model from the viewpoint of bandit algorithms. We show that the algorithm of McMahan and Blum attains a regret of O(T~(2/3)) against an oblivious adversary. Moreover, we show a tighter O((m ln m)~(1/2) T~(1/2)) bound for the expert setting using m experts.
机译:在在线决策问题的算法是选择可行点的一个不知道与它相关的费用每次需要一步。对手无论是东北角或自适应地分配成本可能作出的决定。在线算法,自然,尝试收集尽可能低的成本成为可能。在线算法,并在事后最好的静态决策的成本差异被称为算法的遗憾。卡莱和Vempala表明,可以通过下面的扰动领导人有一些问题与线性成本函数高效的解决方案。他们的解决方案要求所有决策的成本是已知的。最近,人们还一直在土匪设置,其中仅观察到所选择的决策的成本取得了一些进展。阿结合的O(T〜(2/3))对T轮首先由Awerbuch和克莱因伯格示出了用于针对遗憾的忘却对手后来麦默恩和百隆表明,结合O((LN T)〜(1 / 2)T〜(3/4))是针对一个自适应对手获得。在本文中,我们研究从土匪算法的观点卡莱和Vempala的模型。我们发现,麦默恩和百隆的算法达到O(T〜(2/3))针对不经意对手一个遗憾。此外,我们展示了一个紧密的O((M LN米)〜(1/2)T〜(1/2))结合使用米专家专家设置。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号