Annual Conference on Neural Information Processing Systems

Distributed Exploration in Multi-Armed Bandits



Abstract

We study exploration in Multi-Armed Bandits in a setting where k players collaborate in order to identify an ε-optimal arm. Our motivation comes from the recent employment of bandit algorithms in computationally intensive, large-scale applications. Our results demonstrate a non-trivial tradeoff between the number of arm pulls required by each of the players and the amount of communication between them. In particular, our main result shows that by allowing the k players to communicate only once, they are able to learn √k times faster than a single player. That is, distributing learning to k players gives rise to a factor-√k parallel speedup. We complement this result with a lower bound showing that this is in general the best possible. On the other extreme, we present an algorithm that achieves the ideal factor-k speedup in learning performance, with communication only logarithmic in 1/ε.
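
The one-round result has a simple operational shape: each player explores on its own with a reduced budget, nominates a candidate arm, and a single round of communication aggregates the nominations. Below is a minimal Python sketch of that shape, assuming Bernoulli arms; the uniform per-player exploration and the majority vote are illustrative stand-ins for the paper's actual selection rule, not the authors' algorithm, and the names pull, player_candidate, and one_round_protocol are hypothetical.

import random
import statistics
from collections import Counter

def pull(mean):
    # One Bernoulli reward with the given mean (hypothetical arm model).
    return 1.0 if random.random() < mean else 0.0

def player_candidate(means, pulls_per_arm):
    # One player's local exploration: sample every arm uniformly,
    # then return the index of the empirically best arm.
    estimates = [statistics.mean(pull(m) for _ in range(pulls_per_arm))
                 for m in means]
    return max(range(len(means)), key=estimates.__getitem__)

def one_round_protocol(means, k, pulls_per_arm):
    # k players explore independently, then communicate once:
    # each sends its candidate arm, and the most-nominated arm wins.
    votes = Counter(player_candidate(means, pulls_per_arm) for _ in range(k))
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    random.seed(0)
    arm_means = [0.5, 0.55, 0.6, 0.7]  # arm 3 is the best arm
    print("selected arm:", one_round_protocol(arm_means, k=16, pulls_per_arm=200))

The vote step illustrates the amplification intuition: if each player, with a budget well below what a single player would need, identifies an ε-optimal arm with probability better than 1/2, then the majority outcome is correct with high probability. This is roughly the mechanism behind the √k speedup, though the paper's protocol and analysis are more refined.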

