JMLR: Workshop and Conference Proceedings

Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence



Abstract

We consider the problem of near-optimal arm identification in the fixed-confidence setting of the infinitely-armed bandit problem when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which to derive and cast results; (2) derive a sample complexity lower bound for near-optimal arm identification; (3) propose an algorithm that identifies a nearly-optimal arm with high probability and derive an upper bound on its sample complexity which is within a log factor of our lower bound; and (4) discuss whether our $\log^2 \frac{1}{\delta}$ dependence is inescapable for “two-phase” (select arms first, identify the best later) algorithms in the infinite setting. This work permits the application of bandit models to a broader class of problems where fewer assumptions hold.
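The “two-phase” recipe named in the abstract (select arms from the reservoir first, then identify the best among them) can be sketched as follows. This is an illustrative sketch only, not the paper's actual algorithm: the tail-mass assumption `alpha`, the constants, and the Hoeffding-style uniform-sampling second phase are all assumptions introduced here for concreteness.

```python
import math

def two_phase_pac(draw_arm, pull, epsilon=0.1, delta=0.05):
    """Illustrative two-phase PAC sketch (NOT the paper's algorithm).

    Phase 1: draw m arms from the unknown reservoir so that, under a
    hypothetical assumption that the top-epsilon fraction of arms has
    reservoir mass alpha, at least one drawn arm is near-optimal with
    probability >= 1 - delta/2.
    Phase 2: pull each candidate uniformly often enough (Hoeffding bound
    plus a union bound) and return the empirically best arm.
    """
    alpha = epsilon  # hypothetical reservoir tail-mass assumption
    m = math.ceil(math.log(2 / delta) / alpha)
    arms = [draw_arm() for _ in range(m)]

    # Enough pulls so every empirical mean is within epsilon/2 of its
    # true mean with probability >= 1 - delta/2 (union over m arms).
    n = math.ceil((2 / epsilon ** 2) * math.log(4 * m / delta))
    means = [sum(pull(a) for _ in range(n)) / n for a in arms]
    return arms[max(range(m), key=means.__getitem__)]
```

A tighter second phase (e.g. median elimination or an elimination-based race) would shave pulls from clearly suboptimal candidates; uniform sampling is used above only to keep the sketch short.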

