Conference: International Joint Conference on Artificial Intelligence

Context Attentive Bandits: Contextual Bandit with Restricted Context

Abstract

We consider a novel formulation of the multi-armed bandit model, which we call the contextual bandit with restricted context, where only a limited number of features can be accessed by the learner at every iteration. This novel formulation is motivated by different online problems arising in clinical trials, recommender systems and attention modeling. Herein, we adapt the standard multi-armed bandit algorithm known as Thompson Sampling to take advantage of our restricted context setting, and propose two novel algorithms, called the Thompson Sampling with Restricted Context (TSRC) and the Windows Thompson Sampling with Restricted Context (WTSRC), for handling stationary and nonstationary environments, respectively. Our empirical results demonstrate advantages of the proposed approaches on several real-life datasets.
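
The restricted-context setting can be illustrated with a small Python sketch. This is not the authors' TSRC implementation, only a rough illustration under stated assumptions: rewards are binary, each feature carries a Beta(1, 1) prior on its usefulness, and each arm keeps an independent Gaussian estimate per feature weight (a diagonal simplification of linear Thompson Sampling). The class name `RestrictedContextTS`, the `budget` and `v` parameters, and the toy environment in `simulate` are all hypothetical.

```python
import random


class RestrictedContextTS:
    """Illustrative sketch only: Thompson Sampling that observes at most
    `budget` of `n_features` context features per round."""

    def __init__(self, n_arms, n_features, budget, v=0.5, seed=0):
        self.n_arms = n_arms
        self.n_features = n_features
        self.budget = budget
        self.v = v  # assumed exploration scale for the Gaussian samples
        self.rng = random.Random(seed)
        # Beta(1, 1) prior on how useful each feature is (assumption).
        self.feat_success = [1.0] * n_features
        self.feat_failure = [1.0] * n_features
        # Per-arm, per-feature Gaussian statistics (diagonal approximation).
        self.precision = [[1.0] * n_features for _ in range(n_arms)]
        self.moment = [[0.0] * n_features for _ in range(n_arms)]

    def select_features(self):
        """Sample a usefulness score per feature and keep the top `budget`."""
        scores = [self.rng.betavariate(s, f)
                  for s, f in zip(self.feat_success, self.feat_failure)]
        ranked = sorted(range(self.n_features),
                        key=lambda i: scores[i], reverse=True)
        return sorted(ranked[:self.budget])

    def select_arm(self, observed):
        """`observed` maps feature index -> value, selected features only."""
        best_arm, best_score = 0, float("-inf")
        for k in range(self.n_arms):
            score = 0.0
            for i, x in observed.items():
                mean = self.moment[k][i] / self.precision[k][i]
                std = self.v / self.precision[k][i] ** 0.5
                score += self.rng.gauss(mean, std) * x
            if score > best_score:
                best_arm, best_score = k, score
        return best_arm

    def update(self, observed, arm, reward):
        """Update the played arm and credit the observed features (reward in {0, 1})."""
        for i, x in observed.items():
            self.precision[arm][i] += x * x
            self.moment[arm][i] += x * reward
            if reward == 1:
                self.feat_success[i] += 1.0
            else:
                self.feat_failure[i] += 1.0


def simulate(rounds=2000, seed=1):
    """Toy environment (hypothetical): only feature 0 determines the best arm."""
    env = random.Random(seed)
    agent = RestrictedContextTS(n_arms=2, n_features=4, budget=2, seed=seed)
    total = 0
    for _ in range(rounds):
        context = [env.choice([0.0, 1.0]) for _ in range(4)]
        chosen = agent.select_features()          # learner picks what to observe
        observed = {i: context[i] for i in chosen}
        arm = agent.select_arm(observed)
        reward = 1 if arm == int(context[0]) else 0
        agent.update(observed, arm, reward)
        total += reward
    return total / rounds, agent


if __name__ == "__main__":
    average_reward, _ = simulate()
    print("average reward:", average_reward)
```

The point of the sketch is the two-stage decision: the learner first chooses which features to observe, then chooses an arm using only those features. A sliding-window variant for nonstationary environments would additionally have to discount or reset these statistics; that part is not sketched here.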
