Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming

Eugene A. Feinberg; Jefferson Huang; Bruno Scherrer

Abstract

This note shows that the number of arithmetic operations required by any member of a broad class of optimistic policy iteration algorithms to solve a deterministic discounted dynamic programming problem with three states and four actions may grow arbitrarily. Therefore any such algorithm is not strongly polynomial. In particular, the modified policy iteration and λ-policy iteration algorithms are not strongly polynomial.
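For readers unfamiliar with the algorithm family named in the abstract, the following is a minimal Python sketch of modified policy iteration on a deterministic discounted problem: each iteration picks a greedy policy for the current value estimate and then applies that policy's one-step operator `m` times. The instance `ACTIONS`, the discount factor, the choice of `m`, the zero initial values, and the stopping rule are all illustrative assumptions. This is not the three-state, four-action construction from the note, and it does not reproduce the growth in arithmetic operations that the note establishes.

```python
from typing import Dict, List, Tuple

# Hypothetical deterministic problem (NOT the instance from the note):
# state -> list of actions, each action is (reward, next_state).
# Three states and four actions in total.
ACTIONS: Dict[int, List[Tuple[float, int]]] = {
    0: [(0.0, 0)],            # absorbing state with zero reward
    1: [(1.0, 0), (0.0, 2)],  # take 1 now, or move to state 2 for free
    2: [(2.0, 0)],            # collect 2, then move to state 0
}


def modified_policy_iteration(
    actions: Dict[int, List[Tuple[float, int]]],
    gamma: float,
    m: int,
    tol: float = 1e-10,
    max_iter: int = 10_000,
) -> Tuple[Dict[int, int], Dict[int, float], int]:
    """Return (policy, values, iterations) for a reward-maximizing problem.

    m = 1 corresponds to value iteration; large m approaches policy iteration.
    The stopping rule (sup-norm change below tol) is a simplifying assumption.
    """
    v = {s: 0.0 for s in actions}  # assumed initial value estimate
    policy = {s: 0 for s in actions}
    for k in range(1, max_iter + 1):
        # Policy improvement: greedy action index in every state w.r.t. v.
        policy = {
            s: max(
                range(len(acts)),
                key=lambda a: acts[a][0] + gamma * v[acts[a][1]],
            )
            for s, acts in actions.items()
        }
        # Partial policy evaluation: apply the policy's operator m times.
        new_v = dict(v)
        for _ in range(m):
            new_v = {
                s: actions[s][policy[s]][0]
                + gamma * new_v[actions[s][policy[s]][1]]
                for s in actions
            }
        if max(abs(new_v[s] - v[s]) for s in actions) < tol:
            return policy, new_v, k
        v = new_v
    return policy, v, max_iter


if __name__ == "__main__":
    policy, values, iterations = modified_policy_iteration(ACTIONS, gamma=0.9, m=3)
    print(policy, values, iterations)
```

On this toy instance the greedy choice in state 1 switches from the immediate reward to the delayed one after the first iteration, which is the kind of policy change that the partial evaluation step is meant to propagate.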

Bibliographic record

Source: Operations Research Letters: A Journal of the Operations Research Society of America | 2014, Issue 7 | 3 pages
Authors: Eugene A. Feinberg; Jefferson Huang; Bruno Scherrer
Format: PDF
Language: English
Classification (CLC): Operations research
Keywords: Markov decision process; Modified policy iteration; Strongly polynomial; Policy; Algorithm

Similar literature

1. Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming [J]. Eugene A. Feinberg, Jefferson Huang, Bruno Scherrer. Operations Research Letters: A Journal of the Operations Research Society of America. 2014, Issues 6-7.
2. The value iteration algorithm is not strongly polynomial for discounted dynamic programming [J]. Eugene A. Feinberg, Jefferson Huang. Operations Research Letters: A Journal of the Operations Research Society of America. 2014, Issue 2.
3. Obtaining smoother singular arc policies using a modified iterative dynamic programming algorithm [J]. Tholudur A, Ramirez WF. International Journal of Control. 1997, Issue 5.
4. Q-learning and enhanced policy iteration in discounted dynamic programming [C]. Bertsekas D.P., Huizhen Yu. 49th IEEE Conference on Decision and Control. 2010.
5. A Markovian Optimization Model for Pavement Maintenance Using Policy Iteration Algorithm with Discounted Road-user and Agency Costs [D]. Narh-Dometey, Anita. 2019.
6. An efficient computer-aided structural elucidation strategy for mixtures using an iterative dynamic programming algorithm [O]. Bo-Han Su, Meng-Yu Shen, Yeu-Chern Harn, 2017.
7. Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming [O]. Eugene A. Feinberg, Jefferson Huang, Bruno Scherrer. 2014.
8. Discounted Dynamic Programming. Part 5. Modified Policy Iteration [R]. Hordijk, A., Kallenberg, L. C. M. 1990.
