Time-Decaying Bandits for Non-stationary Systems

Abstract

Content displayed on web portals (e.g.,) is usually adaptively selected from a dynamic set of candidate items, and the attractiveness of each item decays over time. The goal of these websites is to maximize user engagement (usually measured by clicks) on the selected items. We formulate this kind of application as a new variant of the bandit problem in which new arms are dynamically added to the candidate set and the expected reward of each arm decays as the rounds proceed. For this new problem, directly applying algorithms designed for the stochastic multi-armed bandit (MAB), e.g., UCB, leads to over-estimation of the rewards of old arms and thus to misidentification of the optimal arm. To tackle this challenge, we propose a new algorithm that adaptively estimates the temporal dynamics in the rewards of the arms and, on this basis, effectively identifies the best arm at a given time point. When the temporal dynamics are represented by a set of features, the proposed algorithm achieves sub-linear regret. Our experiments verify the effectiveness of the proposed algorithm.
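
The over-estimation effect described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the exponential decay form, the function names, and all numeric values are assumptions chosen only for illustration. It compares the long-run average reward of an old arm (what a stationary estimator such as the empirical mean inside UCB would track) with that arm's current expected reward and with a freshly added arm.

```python
# Illustrative sketch only. The exponential decay model and every number
# below are assumptions, not taken from the paper.

def expected_reward(base, decay, age):
    """Expected reward of an arm that is `age` rounds old (assumed exponential decay)."""
    return base * decay ** age


def running_mean_estimate(base, decay, pulls):
    """Average expected reward over the first `pulls` rounds of an arm's life.

    This is the value a stationary empirical-mean estimator would
    approach if the arm were pulled in every one of those rounds.
    """
    return sum(expected_reward(base, decay, a) for a in range(pulls)) / pulls


# Hypothetical arms: an old, initially attractive item and a new, weaker one.
old_arm = dict(base=0.9, decay=0.95)
new_arm = dict(base=0.4, decay=0.95)

pulls = 30
stale = running_mean_estimate(**old_arm, pulls=pulls)   # stationary estimate of the old arm
actual_old = expected_reward(**old_arm, age=pulls)      # what the old arm pays now
actual_new = expected_reward(**new_arm, age=0)          # what the new arm pays now

print(f"stationary estimate of old arm: {stale:.3f}")
print(f"current expected reward, old arm: {actual_old:.3f}")
print(f"current expected reward, new arm: {actual_new:.3f}")
```

Under these assumed values the stationary estimate of the old arm exceeds the new arm's expected reward even though the old arm's current expected reward is lower, which is the kind of misidentification the abstract refers to.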

Bibliographic record

Source: International Conference on Web and Internet Economics, 2014, pp. 460-466 (7 pages)
Authors: Junpei Komiyama; Tao Qin
Original format: PDF

