Source: JMLR: Workshop and Conference Proceedings

Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors



Abstract

Thompson sampling has impressive empirical performance for many multi-armed bandit problems. However, current algorithms for Thompson sampling only work with conjugate priors, since they require online Bayesian posterior inference, which is difficult when the prior is not conjugate. In this paper, we propose a novel algorithm for Thompson sampling that only requires drawing samples from a tractable proposal distribution, so it is efficient even when the prior is non-conjugate. To do this, we reformulate Thompson sampling as an optimization problem via the Gumbel-Max trick. We then construct a set of random variables; the goal is to identify the one with the highest mean, which is an instance of the best arm identification problem. Finally, we solve it with techniques from best arm identification. Experiments show that our algorithm works well in practice.
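For readers unfamiliar with the Gumbel-Max trick named in the abstract, the following is a minimal sketch of the standard trick only, not of the paper's algorithm. It shows how sampling an index with probability proportional to exp(log_weights[i]) can be rewritten as an argmax over log-weights perturbed by independent Gumbel(0, 1) noise. The function names `gumbel_max_sample` and `empirical_frequencies` are hypothetical and chosen for illustration; NumPy is assumed to be available.

```python
import numpy as np


def gumbel_max_sample(log_weights, rng):
    """Draw one index i with probability proportional to exp(log_weights[i]).

    Standard Gumbel-Max trick: add independent Gumbel(0, 1) noise to each
    (possibly unnormalized) log-weight and return the argmax.
    """
    log_weights = np.asarray(log_weights, dtype=float)
    gumbel_noise = rng.gumbel(loc=0.0, scale=1.0, size=log_weights.shape)
    return int(np.argmax(log_weights + gumbel_noise))


def empirical_frequencies(log_weights, num_draws, seed=0):
    """Estimate the sampling distribution by repeated Gumbel-Max draws."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(len(log_weights))
    for _ in range(num_draws):
        counts[gumbel_max_sample(log_weights, rng)] += 1
    return counts / num_draws


if __name__ == "__main__":
    # Illustrative target distribution (an assumption, not data from the paper).
    target = np.array([0.1, 0.3, 0.6])
    print(empirical_frequencies(np.log(target), num_draws=20000))
```

With enough draws, the empirical frequencies should approach the target probabilities. The point of the sketch is that sampling becomes a maximization problem; how the paper combines this with a proposal distribution and best arm identification is not shown here.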
