Collection: U.S. Government Science and Technology Reports

Evolutionary Policy Iteration for Solving Markov Decision Processes

Abstract

The authors propose a novel algorithm called Evolutionary Policy Iteration (EPI) for solving infinite horizon discounted reward Markov Decision Process (MDP) problems. EPI inherits the spirit of the well-known policy iteration (PI) algorithm but eliminates the need to maximize over the entire action space in the policy improvement step, so it should be most effective for problems with very large action spaces. EPI iteratively generates a 'population', or set of policies, such that the performance of the 'elite policy' of each population improves monotonically with respect to a defined fitness function. EPI converges with probability one to a population whose elite policy is an optimal policy for the given MDP. EPI is naturally parallelizable, and in the course of this discussion a distributed variant of PI is also studied.
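The abstract does not spell out the individual steps, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that the elite policy is built by "policy switching" (at each state, copy the action of the population member with the highest value at that state) and that new members are random mutations of the elite. The toy MDP (`P`, `R`), the function names, and all parameter values are hypothetical. Note that the improvement step takes a maximum over population members, never over the action space.

```python
import random

def evaluate(policy, P, R, gamma, sweeps=500):
    """Approximate the discounted value of a fixed policy by repeated backups."""
    n = len(policy)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [R[s][policy[s]]
             + gamma * sum(p * V[t] for t, p in enumerate(P[s][policy[s]]))
             for s in range(n)]
    return V

def elite_by_switching(population, P, R, gamma):
    """Assumed elite construction: per state, take the action of the
    population member whose value at that state is highest."""
    values = [evaluate(pi, P, R, gamma) for pi in population]
    elite = []
    for s in range(len(population[0])):
        best = max(range(len(population)), key=lambda i: values[i][s])
        elite.append(population[best][s])
    return elite

def mutate(policy, n_actions, rate, rng):
    """Assumed mutation: resample each state's action with probability `rate`."""
    return [rng.randrange(n_actions) if rng.random() < rate else a
            for a in policy]

def epi(P, R, gamma, pop_size=4, generations=30, rate=0.3, seed=0):
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    population = [[rng.randrange(n_actions) for _ in range(n_states)]
                  for _ in range(pop_size)]
    history = []  # fitness of the elite policy, here the sum of state values
    for _ in range(generations):
        elite = elite_by_switching(population, P, R, gamma)
        history.append(sum(evaluate(elite, P, R, gamma)))
        # Keep the elite so the next elite can never be worse.
        population = [elite] + [mutate(elite, n_actions, rate, rng)
                                for _ in range(pop_size - 1)]
    return elite, history

# Hypothetical toy MDP: two self-looping states, action 1 pays reward 1.
# P[s][a] is the next-state distribution, R[s][a] the immediate reward.
P = [[[1.0, 0.0], [1.0, 0.0]],
     [[0.0, 1.0], [0.0, 1.0]]]
R = [[0.0, 1.0],
     [0.0, 1.0]]

elite, history = epi(P, R, gamma=0.9)
print(elite, history[0], history[-1])
```

Because the previous elite is always carried into the next population, the recorded fitness values in `history` never decrease (up to floating-point error), which mirrors the monotone improvement property stated in the abstract. The per-member evaluations inside `elite_by_switching` are independent of each other, which is where the parallelism mentioned in the abstract would come in.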

Bibliographic details

Authors
Chang, H. S.; Lee, H.; Fu, M.; Marcus, S.
Year: 2002
Pages: 1-13
Total pages: 13
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords
Algorithms; Optimization; Policies; Decision making; Markov processes; Mutations; Evolution (General); Test and evaluation; Population (Mathematics); Iterations; Problem solving; Convergence; Parallel orientation; Systems analysis; Random variables

Similar literature

1. Evolutionary Policy Iteration for Solving Markov Decision Processes [J]. Hyeong Soo Chang, Hong-Gi Lee, Michael C. Fu. IEEE Transactions on Automatic Control. 2005, No. 11

2. Approximate Policy Iteration with a Policy Language Bias: Solving Relational Markov Decision Processes [J]. Fern A., Givan R., Yoon S. The Journal of Artificial Intelligence Research. 2006, No. 12

3. Approximate Policy Iteration with a Policy Language Bias: Solving Relational Markov Decision Processes [J]. A. Fern, S. Yoon, R. Givan. Journal of Automation, Mobile Robotics & Intelligent Systems. 2006, No. 5

4. Cosine Policy Iteration for Solving Infinite-Horizon Markov Decision Processes [C]. Juan Frausto-Solis, Elizabeth Santiago, Jaime Mora-Vargas. MICAI 2009: Advances in Artificial Intelligence. 2009

5. Acceleration of Iterative Methods for Markov Decision Processes [D]. Shlakhter, Oleksandr. 2010

6. Evolving Robust Policy Coverage Sets in Multi-Objective Markov Decision Processes Through Intrinsically Motivated Self-Play [O]. Sherif Abdelfattah, Kathryn Kasmarik, Jiankun Hu. 2018

7. Evolutionary policy iteration for solving Markov decision processes [O]. Hyeong Soo Chang, Hong-gi Lee, Michael C. Fu. 2005