NONSTOCHASTIC MULTI-ARMED BANDITS WITH GRAPH-STRUCTURED FEEDBACK

Alon Noga; Cesa-Bianchi Nicolo; Gentile Claudio; Mannor Shie; Mansour Yishay; Shamir Ohad

Abstract

We introduce and study a partial-information model of online learning, where a decision maker repeatedly chooses from a finite set of actions, and observes some subset of the associated losses. This naturally models several situations where the losses of different actions are related, and knowing the loss of one action provides information on the loss of other actions. Moreover, it generalizes and interpolates between the well studied full-information setting (where all losses are revealed) and the bandit setting (where only the loss of the action chosen by the player is revealed). We provide several algorithms addressing different variants of our setting, and provide tight regret bounds depending on combinatorial properties of the information feedback structure.
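The feedback model described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the dictionary-based graph encoding, and the convention that the played action's own loss is always revealed are assumptions made here for illustration. The sketch only shows which losses become visible after a choice, and how the full-information and bandit settings arise as the two extreme graphs.

```python
from typing import Dict, List, Set


def observed_losses(graph: Dict[int, Set[int]],
                    losses: List[float],
                    played: int) -> Dict[int, float]:
    """Return the losses revealed when action `played` is chosen.

    graph[a] is the set of actions whose losses are revealed by playing a.
    Assumption (illustrative): the played action's own loss is always seen.
    """
    revealed = graph[played] | {played}
    return {a: losses[a] for a in sorted(revealed)}


def full_information_graph(k: int) -> Dict[int, Set[int]]:
    """Every action reveals the losses of all k actions."""
    return {a: set(range(k)) for a in range(k)}


def bandit_graph(k: int) -> Dict[int, Set[int]]:
    """Every action reveals only its own loss."""
    return {a: {a} for a in range(k)}


if __name__ == "__main__":
    losses = [0.2, 0.7, 0.5]
    # Hypothetical intermediate structure: action 0 also reveals action 1.
    partial = {0: {1}, 1: set(), 2: {0, 1}}
    print(observed_losses(full_information_graph(3), losses, 1))
    print(observed_losses(bandit_graph(3), losses, 1))
    print(observed_losses(partial, losses, 0))
```

Any graph between these two extremes corresponds to an intermediate amount of feedback, which is the regime the abstract says the regret bounds are expressed in terms of.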


Bibliographic details

Source
SIAM Journal on Computing | 2017, Issue 6 | 42 pages
Authors
Alon Noga; Cesa-Bianchi Nicolo; Gentile Claudio; Mannor Shie; Mansour Yishay; Shamir Ohad;
Author affiliations

Tel Aviv Univ Dept Math IL-6997801 Tel Aviv Israel;

Univ Milan Dipartimento Sci Informaz I-20135 Milan Italy;

Univ Insubria Dept Informat & Commun I-21100 Varese Italy;

Technion Dept Elect Engn IL-3200003 Haifa Israel;

Google Res Tel Aviv Israel;

Weizmann Inst Sci Comp Sci & Appl Math IL-7610001 Rehovot Israel;

Indexing information
Original format: PDF
Text language: English
Chinese Library Classification: Applied mathematics;
Keywords
online learning; multi-armed bandits; learning from experts; learning with partial feedback; graph theory;


Similar literature

1. NONSTOCHASTIC MULTI-ARMED BANDITS WITH GRAPH-STRUCTURED FEEDBACK [J] . Alon Noga, Cesa-Bianchi Nicolo, Gentile Claudio, SIAM Journal on Computing . 2017, Issue 6

2. Nonstochastic Bandits with Composite Anonymous Feedback [J] . Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour JMLR: Workshop and Conference Proceedings . 2018, Issue 1

3. Nonstochastic Bandits with Composite Anonymous Feedback [J] . Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour JMLR: Workshop and Conference Proceedings . 2018, Issue 12

4. Individual Regret in Cooperative Nonstochastic Multi-Armed Bandits [C] . Yogev Bar-On, Yishay Mansour Conference on Neural Information Processing Systems . 2020

5. Offline Evaluation of Multi-Armed Bandit Algorithms Using Bootstrapped Replay on Expanded Data [D] . Dai, Jin. 2021

6. Smoking and the bandit: A preliminary study of smoker and non-smoker differences in exploratory behavior measured with a multi-armed bandit task [O] . Merideth A. Addicott, John M. Pearson, Jessica Wilson, -1

7. Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback [O] . Alon, Noga, Cesa-Bianchi, Nicolò, Gentile, Claudio, 2014
