Journal: Journal of Machine Learning Research

Best Arm Identification for Contaminated Bandits



Abstract

This paper studies active learning in the context of robust statistics. Specifically, we propose a variant of the Best Arm Identification problem for contaminated bandits, where each arm pull has probability epsilon of generating a sample from an arbitrary contamination distribution instead of the true underlying distribution. The goal is to identify the best (or approximately best) true distribution with high probability, with a secondary goal of providing guarantees on the quality of this distribution. The primary challenge of the contaminated bandit setting is that the true distributions are only partially identifiable, even with infinite samples. To address this, we develop tight, non-asymptotic sample complexity bounds for high-probability estimation of the first two robust moments (median and median absolute deviation) from contaminated samples. These concentration inequalities are the main technical contributions of the paper and may be of independent interest. Using these results, we adapt several classical Best Arm Identification algorithms to the contaminated bandit setting and derive sample complexity upper bounds for our problem. Finally, we provide matching information-theoretic lower bounds on the sample complexity (up to a small logarithmic factor).
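The contamination model and the two robust moments named in the abstract can be illustrated with a small simulation. The sketch below is not one of the paper's algorithms: it samples every arm the same number of times and picks the arm with the largest sample median. The Gaussian true distributions, the fixed outlier value used as contamination, and all function names are assumptions made for illustration only.

```python
import random
import statistics


def pull_contaminated(true_mean, epsilon, rng, true_sd=1.0, outlier=1000.0):
    """Simulate one arm pull under the contamination model.

    With probability epsilon the sample comes from a contamination
    distribution; here that is a fixed large outlier (an illustrative
    assumption, since the abstract allows arbitrary contamination).
    Otherwise the sample comes from the true distribution, assumed
    Gaussian for this sketch.
    """
    if rng.random() < epsilon:
        return outlier
    return rng.gauss(true_mean, true_sd)


def median_and_mad(samples):
    """Return the sample median and the median absolute deviation (MAD)."""
    med = statistics.median(samples)
    mad = statistics.median([abs(x - med) for x in samples])
    return med, mad


def pick_best_arm_uniform(true_means, epsilon, pulls_per_arm, seed=0):
    """Naive baseline: pull each arm equally often, rank arms by sample median."""
    rng = random.Random(seed)
    estimates = []
    for mu in true_means:
        samples = [pull_contaminated(mu, epsilon, rng) for _ in range(pulls_per_arm)]
        estimates.append(median_and_mad(samples))
    best = max(range(len(true_means)), key=lambda i: estimates[i][0])
    return best, estimates


if __name__ == "__main__":
    best, estimates = pick_best_arm_uniform([0.0, 0.5, 2.0], epsilon=0.1, pulls_per_arm=2000)
    print("selected arm:", best)
    for i, (med, mad) in enumerate(estimates):
        print(f"arm {i}: median={med:.3f}, MAD={mad:.3f}")
```

In this sketch the one-sided outliers push each sample median slightly above the true median, and that shift does not shrink as the number of pulls grows. This is a simple instance of the partial identifiability the abstract describes: more samples reduce the noise in the estimate but not the bias introduced by contamination.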
