Annual Conference on Neural Information Processing Systems

Distributed Exploration in Multi-Armed Bandits



Abstract

We study exploration in Multi-Armed Bandits in a setting where k players collaborate in order to identify an ε-optimal arm. Our motivation comes from the recent employment of bandit algorithms in computationally intensive, large-scale applications. Our results demonstrate a non-trivial tradeoff between the number of arm pulls required by each of the players and the amount of communication between them. In particular, our main result shows that by allowing the k players to communicate only once, they are able to learn √k times faster than a single player. That is, distributing learning to k players gives rise to a factor-√k parallel speedup. We complement this result with a lower bound showing that this is in general the best possible. On the other extreme, we present an algorithm that achieves the ideal factor-k speedup in learning performance, with communication only logarithmic in 1/ε.
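
The one-round result has a simple operational shape: each player explores on its own with a reduced budget, nominates a candidate arm, and a single round of communication aggregates the nominations. Below is a minimal Python sketch of that shape, assuming Bernoulli arms; the uniform per-player exploration and the majority vote are illustrative stand-ins for the paper's actual selection rule, not the authors' algorithm, and the names pull, player_candidate, and one_round_protocol are hypothetical.

import random
import statistics
from collections import Counter

def pull(mean):
    # One Bernoulli reward with the given mean (hypothetical arm model).
    return 1.0 if random.random() < mean else 0.0

def player_candidate(means, pulls_per_arm):
    # One player's local exploration: sample every arm uniformly,
    # then return the index of the empirically best arm.
    estimates = [statistics.mean(pull(m) for _ in range(pulls_per_arm))
                 for m in means]
    return max(range(len(means)), key=estimates.__getitem__)

def one_round_protocol(means, k, pulls_per_arm):
    # k players explore independently, then communicate once:
    # each sends its candidate arm, and the most-nominated arm wins.
    votes = Counter(player_candidate(means, pulls_per_arm) for _ in range(k))
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    random.seed(0)
    arm_means = [0.5, 0.55, 0.6, 0.7]  # arm 3 is the best arm
    print("selected arm:", one_round_protocol(arm_means, k=16, pulls_per_arm=200))

The vote step illustrates the amplification intuition: if each player, with a budget well below what a single player would need, identifies an ε-optimal arm with probability better than 1/2, then the majority outcome is correct with high probability. This is roughly the mechanism behind the √k speedup, though the paper's protocol and analysis are more refined.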

