
Computational Comparison of Value Iteration Algorithms for Discounted Markov Decision Processes.


Abstract

This note reports a computational comparison of value iteration algorithms proposed for solving finite-state discounted Markov decision processes. Such a process visits a set of states S = {1, 2, ..., M}. Section two describes the schemes examined and the various bounds that can be used to stop them. Section three concentrates on one scheme that did well in the comparison, ordinary value iteration, and looks at methods for eliminating non-optimal actions, both permanently and temporarily.
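
The abstract does not spell out the schemes or bounds the report compares, so the following is only a minimal sketch of ordinary value iteration on a finite-state discounted process. As an assumption, it stops using the well-known MacQueen-type bounds on the optimal value; the report may use different ones. The function name `value_iteration`, the nested-list layout of `rewards` and `transitions`, and the two-state example are all hypothetical and chosen for illustration.

```python
def value_iteration(rewards, transitions, beta, eps=1e-8, max_iter=10_000):
    """Ordinary value iteration for a reward-maximizing discounted MDP.

    rewards[i][a]        -- immediate reward for action a in state i
    transitions[i][a][j] -- probability of moving from i to j under a
    beta                 -- discount factor, 0 <= beta < 1
    (This data layout is an assumption made for the sketch.)
    """
    n_states = len(rewards)
    v = [0.0] * n_states
    policy = [0] * n_states
    lower = upper = 0.0
    scale = beta / (1.0 - beta)

    for _ in range(max_iter):
        new_v = []
        for i in range(n_states):
            # One-step lookahead value of every action available in state i.
            q = [
                rewards[i][a]
                + beta * sum(p * v[j] for j, p in enumerate(transitions[i][a]))
                for a in range(len(rewards[i]))
            ]
            best = max(range(len(q)), key=q.__getitem__)
            policy[i] = best
            new_v.append(q[best])

        # MacQueen-type bounds: new_v + lower <= v* <= new_v + upper.
        diffs = [nv - ov for nv, ov in zip(new_v, v)]
        lower = scale * min(diffs)
        upper = scale * max(diffs)
        v = new_v
        if upper - lower < eps:
            break

    # The midpoint of the bounds is within (upper - lower) / 2 of v*.
    midpoint = 0.5 * (lower + upper)
    return [x + midpoint for x in v], policy


# Hypothetical two-state example.
# State 0: action 0 stays (reward 1), action 1 moves to state 1 (reward 0).
# State 1: a single action that stays (reward 2).
rewards = [[1.0, 0.0], [2.0]]
transitions = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0]],
]
values, policy = value_iteration(rewards, transitions, beta=0.9)
print(values, policy)
```

In this example the optimal values can be checked by hand: staying in state 1 forever is worth 2 / (1 - 0.9) = 20, and moving there from state 0 is worth 0.9 * 20 = 18, which beats the value 10 of staying in state 0. The bounds let the loop stop as soon as the gap between them is small, which is typically much earlier than waiting for successive iterates to agree.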

Bibliographic details

Authors
Thomas, L. C.; Hartley, R.; Lavercombe, A. C.
Year: 1982
Pages: 1-14
Total pages: 14
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords
Algorithms; Markov processes; Iterations; Decision theory; Computations; Dynamic programming


Similar literature

1. When are the Value Iteration Maximizers Close to an Optimal Stationary Policy of a Discounted Markov Decision Process? Closing the Gap between the Borel Space Theory and Actual Computations [J]. Raul Montes-de-Oca, Enrique Lemus-Rodriguez. WSEAS Transactions on Mathematics. 2010, No. 1a3.
2. Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces [J]. M. Teresa Robles-Alcaraz, Oscar Vega-Amaya, J. Adolfo Minjarez-Sosa. Risk and Decision Analysis. 2017, No. 2.
3. A Modified Value Iteration Algorithm for Discounted Markov Decision Processes [J]. Sanaa Chafik, Cherki Daoui. Journal of Electronic Commerce in Organizations. 2015, No. 3.
4. The complexity of Policy Iteration is exponential for discounted Markov Decision Processes [C]. Hollanders Romain. IEEE Conference on Decision and Control (CDC). 2012.
5. A Markovian Optimization Model for Pavement Maintenance Using Policy Iteration Algorithm with Discounted Road-user and Agency Costs [D]. Narh-Dometey, Anita. 2019.
6. Modeling treatment of ischemic heart disease with partially observable Markov decision processes [O]. M. Hauskrecht, H. Fraser. 1998.
7. Computational comparison of value iteration algorithms for discounted Markov decision processes [O]. Thomas L. C., Hartley R., Lavercombe A. C. 1982.
