Conference: IEEE International Conference on Data Mining Workshops

Contextual Bandit with Adaptive Feature Extraction


Abstract

We consider an online decision-making setting known as the contextual bandit problem and propose an approach for improving contextual bandit performance through adaptive feature extraction (representation learning) based on online clustering. Our approach starts with offline pre-training on an unlabeled history of contexts (which our approach can exploit, but the standard contextual bandit cannot), followed by online selection and adaptation of encoders. Specifically, given an input sample (context), the proposed approach selects the most appropriate encoding function to extract a feature vector, which becomes the input to a contextual bandit, and then updates both the bandit and the encoding function based on the context and the feedback (reward). Our experiments on a variety of datasets, in both stationary and non-stationary environments of several kinds, demonstrate clear advantages of the proposed adaptive representation learning over the standard contextual bandit operating on "raw" input contexts.
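The loop described in the abstract (pre-train on unlabeled contexts, select an encoder per context, feed the encoded features to a bandit, then update both) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the encoders or the bandit algorithm, so the class names, the centroid-residual encoding, and the epsilon-greedy linear bandit below are all hypothetical stand-ins. The encoder update here also uses only the context, a simplification of the reward-aware update the abstract mentions.

```python
import random


class OnlineClusterEncoder:
    """Hypothetical encoder bank: one centroid per encoder, online k-means updates."""

    def __init__(self, centroids):
        self.centroids = [list(c) for c in centroids]
        self.counts = [1] * len(self.centroids)

    def select(self, x):
        # Choose the encoder whose centroid is closest to the context.
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in self.centroids]
        return dists.index(min(dists))

    def encode(self, k, x):
        # Assumed encoding: one-hot cluster id followed by the residual to the centroid.
        one_hot = [1.0 if j == k else 0.0 for j in range(len(self.centroids))]
        residual = [a - b for a, b in zip(x, self.centroids[k])]
        return one_hot + residual

    def update(self, k, x):
        # Online k-means step: move the selected centroid toward the context.
        self.counts[k] += 1
        lr = 1.0 / self.counts[k]
        self.centroids[k] = [c + lr * (a - c) for a, c in zip(x, self.centroids[k])]


class EpsilonGreedyLinearBandit:
    """Stand-in contextual bandit: one linear reward model per arm, SGD updates."""

    def __init__(self, n_arms, dim, epsilon=0.1, lr=0.1, seed=0):
        self.weights = [[0.0] * dim for _ in range(n_arms)]
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def _score(self, arm, z):
        return sum(w * v for w, v in zip(self.weights[arm], z))

    def choose(self, z):
        # Explore with probability epsilon, otherwise pick the best-scoring arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weights))
        scores = [self._score(a, z) for a in range(len(self.weights))]
        return scores.index(max(scores))

    def update(self, arm, z, reward):
        # One gradient step on the squared prediction error of the played arm.
        err = reward - self._score(arm, z)
        self.weights[arm] = [w + self.lr * err * v for w, v in zip(self.weights[arm], z)]


def pretrain(history, k):
    # Offline pass over unlabeled contexts: seed k centroids, then refine them.
    encoder = OnlineClusterEncoder(history[:k])
    for x in history[k:]:
        encoder.update(encoder.select(x), x)
    return encoder


def run(contexts, reward_fn, encoder, bandit):
    # Online loop: select encoder, encode, act, observe reward, update both parts.
    total = 0.0
    for x in contexts:
        k = encoder.select(x)
        z = encoder.encode(k, x)
        arm = bandit.choose(z)
        reward = reward_fn(x, arm)
        bandit.update(arm, z, reward)
        encoder.update(k, x)
        total += reward
    return total
```

A caller would build the encoder with `pretrain(history, k)`, create a bandit whose `dim` equals `k` plus the context dimension, and pass both to `run` together with a hypothetical `reward_fn(context, arm)` that returns the observed reward.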
