Source: JMLR: Workshop and Conference Proceedings

More Adaptive Algorithms for Adversarial Bandits

Abstract

We develop a novel and generic algorithm for the adversarial multi-armed bandit problem (or more generally the combinatorial semi-bandit problem). When instantiated differently, our algorithm achieves various new data-dependent regret bounds improving previous work. Examples include: 1) a regret bound depending on the variance of only the best arm; 2) a regret bound depending on the first-order path-length of only the best arm; 3) a regret bound depending on the sum of the first-order path-lengths of all arms as well as an important negative term, which together lead to faster convergence rates for some normal form games with partial feedback; 4) a regret bound that simultaneously implies small regret when the best arm has small loss and logarithmic regret when there exists an arm whose expected loss is always smaller than those of other arms by a fixed gap (e.g. the classic i.i.d. setting). In some cases, such as the last two results, our algorithm is completely parameter-free. The main idea of our algorithm is to apply the optimism and adaptivity techniques to the well-known Online Mirror Descent framework with a special log-barrier regularizer. The challenges are to come up with appropriate optimistic predictions and correction terms in this framework. Some of our results also crucially rely on using a sophisticated increasing learning rate schedule.
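
To make the building block named in the abstract concrete, the following is a minimal sketch of plain Online Mirror Descent with a log-barrier regularizer for the multi-armed bandit setting. It is not the algorithm of the paper: it assumes a single fixed learning rate `eta` shared by all arms and the standard importance-weighted loss estimator, and it leaves out the optimistic predictions, correction terms, and increasing learning rate schedule that the abstract describes. The function names, the bisection search for the normalization constant, and all default parameters are illustrative choices, not taken from the source.

```python
import numpy as np


def log_barrier_omd_step(w, loss_est, eta, iters=100):
    """One OMD step on the simplex with regularizer (1/eta) * sum_i ln(1/w_i).

    The first-order optimality conditions give
        1 / w_new[i] = 1 / w[i] + eta * (loss_est[i] - lam),
    where the scalar lam is chosen so that w_new sums to one.
    lam is located by bisection (an illustrative choice).
    """
    inv = 1.0 / w
    lo = loss_est.min()                 # at this lam the new weights sum to <= 1
    hi = (loss_est + inv / eta).min()   # at this lam some denominator reaches 0
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        denom = inv + eta * (loss_est - lam)
        if np.any(denom <= 0) or (1.0 / denom).sum() > 1.0:
            hi = lam                    # lam too large: total mass exceeds one
        else:
            lo = lam                    # lam feasible: total mass at most one
    w_new = 1.0 / (inv + eta * (loss_est - lo))
    return w_new / w_new.sum()          # absorb the remaining rounding error


def run_log_barrier_bandit(losses, eta=0.1, seed=0):
    """Play a T x K loss matrix under bandit feedback.

    Returns the final weights and the total loss actually suffered.
    """
    rng = np.random.default_rng(seed)
    T, K = losses.shape
    w = np.full(K, 1.0 / K)             # start from the uniform distribution
    total = 0.0
    for t in range(T):
        arm = rng.choice(K, p=w)        # sample an arm from the current weights
        loss = losses[t, arm]           # only the chosen arm's loss is observed
        total += loss
        est = np.zeros(K)
        est[arm] = loss / w[arm]        # importance-weighted, unbiased estimate
        w = log_barrier_omd_step(w, est, eta)
    return w, total
```

A hypothetical usage would build a loss matrix with one consistently better arm, for example `losses = np.full((2000, 3), 0.9); losses[:, 0] = 0.1`, and call `run_log_barrier_bandit(losses)`. The log-barrier keeps every weight strictly positive, so the importance-weighted estimates stay finite at every round.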
