Non-Stochastic Multi-Player Multi-Armed Bandits: Optimal Rate With Collision Information, Sublinear Without

Sébastien Bubeck; Yuanzhi Li; Yuval Peres; Mark Sellke


Abstract

We consider the non-stochastic version of the (cooperative) multi-player multi-armed bandit problem. The model assumes no communication and no shared randomness at all between the players, and furthermore when two (or more) players select the same action this results in a maximal loss. We prove the first $\sqrt{T}$-type regret guarantee for this problem, assuming only two players, and under the feedback model where collisions are announced to the colliding players. We also prove the first sublinear regret guarantee for the feedback model where collision information is not available, namely $T^{1-\frac{1}{2m}}$ where $m$ is the number of players.
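The collision rule and the second rate in the abstract can be made concrete with a small sketch. This is only an illustration of the setting, not the authors' algorithm. The function names, the convention that losses lie in [0, 1], and the choice of 1.0 as the maximal loss are assumptions introduced here.

```python
from collections import Counter
from typing import List, Sequence


def collision_losses(actions: Sequence[int],
                     arm_losses: Sequence[float],
                     max_loss: float = 1.0) -> List[float]:
    """Per-player loss for a single round (hypothetical helper).

    actions[i] is the arm chosen by player i, arm_losses[a] is the loss of
    arm a in this round. A player sharing an arm with at least one other
    player suffers max_loss (assumed to be 1.0 here); a player alone on an
    arm suffers that arm's loss.
    """
    counts = Counter(actions)
    return [max_loss if counts[a] > 1 else arm_losses[a] for a in actions]


def no_collision_info_exponent(m: int) -> float:
    """Exponent of T in the rate T^(1 - 1/(2m)) stated for m players."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 - 1.0 / (2 * m)


if __name__ == "__main__":
    # Players 0 and 1 collide on arm 0; player 2 is alone on arm 2.
    print(collision_losses([0, 0, 2], [0.1, 0.5, 0.3]))
    # The exponent approaches 1 (linear regret) as the number of players grows.
    for m in (2, 3, 5, 10):
        print(m, no_collision_info_exponent(m))
```

For two players the exponent is 0.75, so the guarantee without collision information is $T^{3/4}$, compared with the $\sqrt{T}$-type rate available when collisions are announced.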


Bibliographic details

Source: JMLR: Workshop and Conference Proceedings, 2020, No. 2010, 27 pages
Authors: Sébastien Bubeck; Yuanzhi Li; Yuval Peres; Mark Sellke
Original format: PDF

Similar literature

1. Millimeter-Wave Concurrent Beamforming: A Multi-Player Multi-Armed Bandit Approach [J]. Ehab Mahmoud Mohamed, Sherief Hashima, Kohei Hatano. Computers, Materials & Continua. 2020, No. 3
2. D2D Resource Allocation with Power Control Based on Multi-player Multi-armed Bandit [J]. Kuo Fang-Chang, Schindelhauer Christian, Wang Hwang-Cheng. Wireless Personal Communications: An International Journal. 2020, No. 3
3. Multi-Player Multi-Armed Bandits for Stable Allocation in Heterogeneous Ad-Hoc Networks [J]. Darak Sumit J., Hanawal Manjesh K. IEEE Journal on Selected Areas in Communications. 2019, No. 10
4. Research on Optimal Selection Strategy of Search Engine Keywords Based on Multi-armed Bandit [C]. Juan Qin, Wei Qi, Baojian Zhou. Hawaii International Conference on System Sciences. 2016
5. Behavioral models of strategies in multi-armed bandit problems [D]. Anderson, Christopher Madden. 2001
6. Gateway Selection in Millimeter Wave UAV Wireless Networks Using Multi-Player Multi-Armed Bandit [O]. Ehab Mahmoud Mohamed, Sherief Hashima, Abdallah Aldosary. 2020
7. On No-Sensing Adversarial Multi-Player Multi-Armed Bandits With Collision Communications [O]. Chengshuai Shi, Cong Shen. 2021
