Home > Conference Papers > International Conference on Communication Systems and Networks > A Hidden Markov Restless Multi-armed Bandit Model for Playout Recommendation Systems

A Hidden Markov Restless Multi-armed Bandit Model for Playout Recommendation Systems



Abstract

We consider a restless multi-armed bandit (RMAB) in which each arm can be in one of two states, say 0 or 1. Playing an arm generates a unit reward with a probability that depends on the state of the arm. The belief about the state of the arm can be calculated using a Bayesian update after every play. This RMAB is designed for use in recommendation systems where the user's preferences depend on the history of recommendations. In this paper we analyse the RMAB by first studying the single-armed bandit. We show that it is Whittle-indexable and obtain a closed-form expression for the Whittle index. For an RMAB to be useful in practice, we need to be able to learn the parameters of the arms. We present a Thompson sampling scheme that learns the parameters of the arms, and illustrate its performance numerically.
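The abstract describes updating the belief about an arm's hidden state with a Bayes step after every play. A minimal sketch of that update for a two-state arm, assuming reward probabilities in each state and a known state-transition matrix (the function name, parameter names, and numbers below are illustrative, not from the paper):

```python
def belief_update(pi, reward, rho, P):
    """One Bayesian belief update for a two-state hidden Markov arm.

    pi:     prior probability that the arm is in state 1.
    reward: observed reward from playing the arm (0 or 1).
    rho:    (rho0, rho1), probability of a unit reward in states 0 and 1.
    P:      2x2 transition matrix, P[s][t] = prob. of moving state s -> t.
    Returns the posterior probability of state 1 after one more transition.
    """
    rho0, rho1 = rho
    # Likelihood of the observed reward under each hidden state.
    like1 = rho1 if reward == 1 else 1.0 - rho1
    like0 = rho0 if reward == 1 else 1.0 - rho0
    # Posterior over the current state via Bayes' rule.
    p1 = pi * like1 / (pi * like1 + (1.0 - pi) * like0)
    # Propagate the posterior one step through the Markov chain.
    return (1.0 - p1) * P[0][1] + p1 * P[1][1]

# Example: a symmetric-looking setup; observing a reward pulls the
# belief toward state 1 before the transition smooths it back.
P = [[0.8, 0.2], [0.3, 0.7]]
print(belief_update(0.5, 1, (0.2, 0.8), P))  # -> 0.6
```

Repeating this update after every play yields the belief trajectory on which the Whittle index and the Thompson sampling scheme in the paper operate.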

