Conference paper: International conference on communication systems and networks

A Hidden Markov Restless Multi-armed Bandit Model for Playout Recommendation Systems



Abstract

We consider a restless multi-armed bandit (RMAB) in which each arm can be in one of two states, 0 or 1. Playing an arm generates a unit reward with a probability that depends on the state of the arm. The belief about the state of an arm can be computed with a Bayesian update after every play. This RMAB is designed for use in recommendation systems where the user's preferences depend on the history of recommendations. In this paper we analyse the RMAB by first studying the single-armed bandit. We show that it is Whittle-indexable and obtain a closed-form expression for the Whittle index. For an RMAB to be useful in practice, we need to be able to learn the parameters of the arms. We present a Thompson sampling scheme that learns these parameters, and we illustrate its performance numerically.
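The belief update mentioned in the abstract can be sketched for a single two-state arm. This is an illustrative reconstruction, not the paper's code: the parameter names (`p01`, `p11` for the state-transition probabilities and `rho0`, `rho1` for the reward probabilities in states 0 and 1) are assumptions introduced here for clarity.

```python
def belief_update(pi, reward, p01, p11, rho0, rho1):
    """One step of the Bayesian belief update for a two-state arm.

    pi     : prior belief P(state = 1) before the play
    reward : observed reward on this play (0 or 1)
    p01    : P(next state = 1 | current state = 0)
    p11    : P(next state = 1 | current state = 1)
    rho0   : P(unit reward | state = 0)
    rho1   : P(unit reward | state = 1)
    Returns the updated belief P(state = 1) for the next play.
    """
    # Bayes step: posterior probability of state 1 given the observed reward.
    if reward == 1:
        num = pi * rho1
        den = pi * rho1 + (1.0 - pi) * rho0
    else:
        num = pi * (1.0 - rho1)
        den = pi * (1.0 - rho1) + (1.0 - pi) * (1.0 - rho0)
    post = num / den
    # Prediction step: propagate the posterior through the Markov transition.
    return post * p11 + (1.0 - post) * p01
```

For example, with a flat prior `pi = 0.5`, reward probabilities `rho0 = 0.1`, `rho1 = 0.9`, and transitions `p01 = 0.2`, `p11 = 0.8`, observing a reward sharpens the posterior to 0.9 before the transition step maps it to the next-play belief.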
