Analysis of Thompson Sampling for Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms

Alihan Huyuk; Cem Tekin

Abstract

We analyze the regret of combinatorial Thompson sampling (CTS) for the combinatorial multi-armed bandit with probabilistically triggered arms under the semi-bandit feedback setting. We assume that the learner has access to an exact optimization oracle but does not know the expected base arm outcomes beforehand. When the expected reward function is Lipschitz continuous in the expected base arm outcomes, we derive an $O\left(\sum_{i=1}^{m} \log T / (p_i \Delta_i)\right)$ regret bound for CTS, where $m$ denotes the number of base arms, $p_i$ denotes the minimum non-zero triggering probability of base arm $i$, and $\Delta_i$ denotes the minimum suboptimality gap of base arm $i$. We also compare CTS with combinatorial upper confidence bound (CUCB) via numerical experiments on a cascading bandit problem.
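
To make the setting in the abstract concrete, the following is a minimal sketch of CTS on a disjunctive cascading bandit. It is an illustration under stated assumptions, not the authors' experimental code: Bernoulli base arm outcomes, independent Beta(1, 1) priors, a top-K selection standing in for the exact optimization oracle, and the function name `cts_cascading` with its parameters are all hypothetical choices made here.

```python
import numpy as np


def cts_cascading(mu, K, T, seed=0):
    """Illustrative CTS on a disjunctive cascading bandit.

    mu: true expected base arm outcomes (unknown to the learner).
    K:  number of base arms listed per round.
    T:  number of rounds.
    Returns cumulative pseudo-regret and the Beta posterior parameters.
    """
    mu = np.asarray(mu, dtype=float)
    rng = np.random.default_rng(seed)
    m = len(mu)
    a = np.ones(m)  # 1 + number of observed successes per base arm
    b = np.ones(m)  # 1 + number of observed failures per base arm

    # Expected reward of the best list: probability that at least one arm succeeds.
    best = np.sort(mu)[::-1][:K]
    opt_reward = 1.0 - np.prod(1.0 - best)

    regret = np.zeros(T)
    total = 0.0
    for t in range(T):
        theta = rng.beta(a, b)            # one posterior sample per base arm
        action = np.argsort(-theta)[:K]   # assumed oracle: top-K by sampled mean
        total += opt_reward - (1.0 - np.prod(1.0 - mu[action]))
        regret[t] = total

        # Semi-bandit feedback: arms are triggered in list order, and the
        # cascade stops at the first success, so later arms stay unobserved.
        for i in action:
            x = int(rng.random() < mu[i])
            a[i] += x
            b[i] += 1 - x
            if x:
                break
    return regret, a, b


if __name__ == "__main__":
    regret, a, b = cts_cascading([0.9, 0.8, 0.7, 0.2, 0.1, 0.05], K=2, T=2000)
    print("final cumulative pseudo-regret:", regret[-1])
```

In this sketch an arm placed later in the list is observed only when every earlier arm fails, which is the probabilistic triggering that the $p_i$ terms in the bound account for.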

Bibliographic record

Source: JMLR: Workshop and Conference Proceedings, 2018, Issue 2010, 9 pages
Authors: Alihan Huyuk; Cem Tekin
Original format: PDF
CLC classification: Artificial intelligence theory
