Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence

Maryam Aziz; Jesse Anderton; Emilie Kaufmann; Javed Aslam

Abstract

We consider the problem of near-optimal arm identification in the fixed-confidence setting of the infinitely armed bandit problem when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which to derive and cast results; (2) derive a sample complexity lower bound for near-optimal arm identification; (3) propose an algorithm that identifies a nearly-optimal arm with high probability and derive an upper bound on its sample complexity which is within a log factor of our lower bound; and (4) discuss whether our $\log^2 \frac{1}{\delta}$ dependence is inescapable for “two-phase” (select arms first, identify the best later) algorithms in the infinite setting. This work permits the application of bandit models to a broader class of problems where fewer assumptions hold.
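To make the "two-phase" idea concrete, here is a minimal Python sketch of a naive select-then-identify procedure. It is not the algorithm proposed in the paper. Everything in it is an assumption made for illustration: Bernoulli arms, a uniform reservoir on [0, 1], the parameters `alpha` (target top fraction of the reservoir) and `eps` (tolerance), the function names, and the uniform-sampling second phase sized with Hoeffding's inequality.

```python
import math
import random


def phase_sizes(alpha, eps, delta):
    """Return (n_arms, pulls_per_arm) for the naive two-phase sketch."""
    # Phase 1: (1 - alpha)^n <= exp(-alpha * n) <= delta / 2, so with
    # probability >= 1 - delta/2 some drawn arm is in the top alpha fraction.
    n_arms = math.ceil(math.log(2.0 / delta) / alpha)
    # Phase 2: Hoeffding plus a union bound over n_arms arms keeps every
    # empirical mean within eps/2 of its true mean w.p. >= 1 - delta/2.
    pulls = math.ceil((2.0 / eps**2) * math.log(4.0 * n_arms / delta))
    return n_arms, pulls


def two_phase_sketch(alpha, eps, delta, rng):
    """Select arms first, then return the empirically best one.

    Assumes (hypothetically) Bernoulli arms whose means are drawn
    uniformly from [0, 1]; a real reservoir would be unknown.
    """
    n_arms, pulls = phase_sizes(alpha, eps, delta)
    # Phase 1: draw arms from the reservoir.
    means = [rng.random() for _ in range(n_arms)]
    # Phase 2: pull every selected arm the same number of times.
    best_mean, best_emp = None, -1.0
    for mu in means:
        successes = sum(1 for _ in range(pulls) if rng.random() < mu)
        emp = successes / pulls
        if emp > best_emp:
            best_mean, best_emp = mu, emp
    return best_mean, best_emp, n_arms * pulls


if __name__ == "__main__":
    mu, emp, total = two_phase_sketch(0.1, 0.2, 0.1, random.Random(0))
    print(f"true mean {mu:.3f}, empirical mean {emp:.3f}, pulls {total}")
```

In this sketch the number of arms grows like $\log \frac{1}{\delta}$ and the pulls per arm grow like $\log \frac{1}{\delta}$ as well, so for fixed `alpha` and `eps` the total budget scales as $\log^2 \frac{1}{\delta}$. That is one simple way to see where such a dependence can come from in a two-phase design; the paper's own bounds and algorithm are not reproduced here.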

Bibliographic record

Source: JMLR: Workshop and Conference Proceedings, 2018, issue 2010, 22 pages
Authors: Maryam Aziz; Jesse Anderton; Emilie Kaufmann; Javed Aslam
Original format: PDF
