GITTINS INDEX FOR SIMPLE FAMILY OF MARKOV BANDIT PROCESSES WITH SWITCHING COST AND NO DISCOUNTING

Savelov M. P.

Abstract

We consider the multiarmed bandit problem (the problem of Markov bandits) with switching penalties and no discounting in the case where the state spaces of all bandits are finite. An optimal strategy is one that attains the largest average reward per unit time over an infinite time horizon. For this problem, it is shown that an optimal strategy can be specified by a Gittins index under the natural assumption that the switching penalties are nonnegative.
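The criterion in the abstract (average reward per unit time, net of switching penalties) can be illustrated with a small simulation. This is only a sketch of the objective, not the paper's index construction: the function name `simulate_average_reward`, the `(P, r)` arm representation, the convention that every arm starts in state 0, the choice to charge no penalty for the first selection, and the two toy arms are all assumptions made for illustration.

```python
import random


def simulate_average_reward(arms, policy, switch_cost, horizon, seed=0):
    """Estimate the average reward per unit time of `policy` over `horizon` steps.

    arms: list of (P, r) pairs; P[s] lists the transition probabilities out of
          state s, and r[s] is the reward collected when the arm is played in s.
    policy: function (states, previous_arm) -> index of the arm to play next.
    switch_cost: nonnegative penalty charged whenever the played arm changes.
    """
    rng = random.Random(seed)
    states = [0] * len(arms)   # assumption: every arm starts in state 0
    previous_arm = None        # assumption: the first choice is free
    total = 0.0
    for _ in range(horizon):
        arm = policy(states, previous_arm)
        if previous_arm is not None and arm != previous_arm:
            total -= switch_cost
        P, r = arms[arm]
        s = states[arm]
        total += r[s]
        # Only the played arm changes state; the other arms stay frozen.
        states[arm] = rng.choices(range(len(r)), weights=P[s])[0]
        previous_arm = arm
    return total / horizon


# Hypothetical toy instance: arm 0 alternates between rewards 1 and 3,
# arm 1 has a single state paying 1.5.
toy_arms = [
    ([[0.0, 1.0], [1.0, 0.0]], [1.0, 3.0]),
    ([[1.0]], [1.5]),
]


def stay_on_arm_0(states, previous_arm):
    return 0


def alternate_arms(states, previous_arm):
    return 1 if previous_arm == 0 else 0


stay_value = simulate_average_reward(toy_arms, stay_on_arm_0, 0.5, 1000)
alternate_value = simulate_average_reward(toy_arms, alternate_arms, 0.5, 1000)
print(stay_value, alternate_value)
```

In this toy instance, the strategy that switches at every step pays the penalty on almost every step, so its average reward falls below that of staying on one arm. An index-based strategy of the kind the abstract describes would have to weigh such penalties when deciding whether a switch is worthwhile.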


Bibliographic details

Source: Theory of Probability and Its Applications, 2019, No. 3, 10 pages
Author: Savelov M. P.
Affiliation: Novosibirsk State Univ, Novosibirsk, Novosibirsk Obl, Russia
Original format: PDF
Language: English
CLC classification: Probability theory and mathematical statistics
Keywords: multicomponent systems; Gittins index; simple family of alternative Markov bandit processes; multiarmed bandit problem; Markov decision process; controlled Markov processes; long run average return; no discounting; switching penalties; optimal strategy


