Accelerated modified policy iteration algorithms for Markov decision processes

Shlakhter O.; Lee C.-G.

Abstract

We propose a new approach to accelerate the convergence of the modified policy iteration method for Markov decision processes with the total expected discounted reward. In the new policy iteration, an additional operator is applied to the iterate generated by the Markov operator, resulting in a larger improvement in each iteration.
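The abstract does not define the additional operator, so the following is only a minimal sketch of standard (unaccelerated) modified policy iteration for a discounted MDP, the baseline the paper sets out to speed up. The function name, the argument names `P`, `R`, `gamma`, `m`, `tol`, and the array layout are assumptions made for illustration and are not taken from the paper. A comment marks the point where an accelerating operator would presumably act.

```python
def modified_policy_iteration(P, R, gamma, m=5, tol=1e-8, max_iter=1000):
    """Standard modified policy iteration (illustrative sketch).

    P[a][s][t] : probability of moving from state s to t under action a
    R[a][s]    : expected one-step reward for action a in state s
    gamma      : discount factor, 0 <= gamma < 1
    m          : number of extra fixed-policy backups per iteration
    """
    n_actions = len(P)
    n_states = len(P[0])
    v = [0.0] * n_states
    policy = [0] * n_states

    def q(a, s, values):
        # One-step lookahead value of taking action a in state s.
        return R[a][s] + gamma * sum(
            P[a][s][t] * values[t] for t in range(n_states)
        )

    for _ in range(max_iter):
        # Policy improvement: act greedily with respect to the current v.
        policy = [
            max(range(n_actions), key=lambda a: q(a, s, v))
            for s in range(n_states)
        ]
        # One backup under the greedy policy (equals the Bellman update of v).
        u = [q(policy[s], s, v) for s in range(n_states)]
        residual = max(abs(u[s] - v[s]) for s in range(n_states))

        # Partial policy evaluation: m further backups with the policy fixed.
        for _ in range(m):
            u = [q(policy[s], s, u) for s in range(n_states)]

        # Assumption: an accelerating operator such as the one the abstract
        # mentions would be applied to u here; its form is not given there.
        v = u
        if residual < tol:
            break

    return v, policy
```

With `m = 0` the loop reduces to value iteration, and as `m` grows it approaches full policy iteration; the choice of `m` trades work per iteration against the number of iterations.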

Bibliographic record

Source: Mathematical Methods of Operations Research, 2013, No. 1, 16 pages
Authors: Shlakhter O.; Lee C.-G.
Original format: PDF
Language: English
Chinese Library Classification: 51.91
Keywords: Accelerated convergence; Markov decision processes; Modified policy iteration; Policy iteration

Similar literature

1. Accelerated modified policy iteration algorithms for Markov decision processes [J]. Oleksandr Shlakhter, Chi-Guhn Lee. Mathematical Methods of Operations Research, 2013, No. 1.
2. Potential-based online policy iteration algorithms for Markov decision processes [J]. Hai-Tao Fang, Xi-Ren Cao. IEEE Transactions on Automatic Control, 2004, No. 4.
3. Policy iteration type algorithms for recurrent state Markov decision processes [J]. Stephen D. Patek. Computers & Operations Research, 2004, No. 14.
4. A policy iteration algorithm for Markov decision processes skip-free in one direction [C]. J. Lambert, B. Van Houdt, C. Blondia. Proceedings of the 2nd International Conference on Performance Evaluation Methodologies and Tools, 2007.
5. Increasing scalability in algorithms for centralized and decentralized partially observable Markov decision processes: Efficient decision-making and coordination in uncertain environments [D]. Amato, Christopher. 2010.
6. Evolving Robust Policy Coverage Sets in Multi-Objective Markov Decision Processes Through Intrinsically Motivated Self-Play [O]. Sherif Abdelfattah, Kathryn Kasmarik, Jiankun Hu. 2018.
7. Approximate Policy Iteration for Generalized Semi-Markov Decision Processes: an Improved Algorithm [O]. Rachelson Emmanuel, Fabiani Patrick, Garcia Frédérick. 2008.
