Journal: Sequential Analysis
Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models

Abstract

In this article, we study sequential testing problems with overlapping hypotheses. We first focus on the simple problem of assessing whether the mean $\mu$ of a Gaussian distribution is smaller or larger than a fixed $\epsilon > 0$; if $\mu \in (-\epsilon, \epsilon)$, both answers are considered to be correct. Then, we consider probably approximately correct (PAC) best-arm identification in a bandit model: given $K$ probability distributions on $\mathbb{R}$ with means $\mu_1, \ldots, \mu_K$, we derive the asymptotic complexity of identifying, with risk at most $\delta$, an index $I \in \{1, \ldots, K\}$ such that $\mu_I \geq \max_i \mu_i - \epsilon$. We provide nonasymptotic bounds on the error of a parallel general likelihood ratio test, which can also be used for more general testing problems. We further propose lower bounds on the number of observations needed to identify a correct hypothesis. These lower bounds rely on information-theoretic arguments, and specifically on two versions of a change of measure lemma (a high-level form and a low-level form) whose relative merits are discussed.
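
To make the first testing problem concrete, here is a minimal Python sketch of a sequential test with two overlapping answers. It is not the parallel likelihood ratio test analyzed in the article: it assumes a known standard deviation `sigma` and uses a crude union-bound confidence radius, chosen only because its error guarantee is easy to check. The names `radius`, `sequential_overlap_test`, and `gaussian_stream` are hypothetical and introduced purely for illustration.

```python
import math
import random


def radius(t, delta, sigma=1.0):
    """One-sided confidence radius for the empirical mean after t samples.

    Assumption (not from the article): a union bound with weights
    delta / (t * (t + 1)), which sum to delta over t = 1, 2, ...
    Combined with the Gaussian tail bound P(Z > x) <= exp(-x^2 / 2), this
    makes each one-sided bound hold for all t with probability >= 1 - delta.
    """
    return sigma * math.sqrt(2.0 * math.log(t * (t + 1) / delta) / t)


def sequential_overlap_test(samples, epsilon, delta, sigma=1.0):
    """Return (decision, n) after consuming n samples.

    'below' claims mu < epsilon; 'above' claims mu > -epsilon.
    When mu lies in (-epsilon, epsilon), either answer is correct.
    Returns (None, n) if the samples run out before a decision.
    """
    total = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        total += x
        mean = total / n
        r = radius(n, delta, sigma)
        if mean + r < epsilon:   # upper confidence bound is below epsilon
            return "below", n
        if mean - r > -epsilon:  # lower confidence bound is above -epsilon
            return "above", n
    return None, n


def gaussian_stream(mu, sigma, seed):
    """Endless stream of i.i.d. Gaussian samples (illustrative helper)."""
    rng = random.Random(seed)
    while True:
        yield rng.gauss(mu, sigma)


if __name__ == "__main__":
    decision, n = sequential_overlap_test(
        gaussian_stream(mu=0.1, sigma=1.0, seed=0), epsilon=0.5, delta=0.05
    )
    print(decision, n)
```

Because the two hypotheses overlap, this sketch always stops once `radius(n, delta, sigma)` falls below `epsilon`: at that point at least one of the two conditions must hold, whatever the empirical mean is. That is the practical benefit of the overlap, since a test between $\mu < 0$ and $\mu > 0$ has no such deterministic bound when $\mu$ is close to $0$.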