Technical Note: A Computationally Efficient Algorithm For Undiscounted Markov Decision Processes With Restricted Observations

Lauren B. Davis; Thom J. Hodgson; Russell E. King; Wenbin Wei

Abstract

We present a computationally efficient procedure to determine control policies for an infinite-horizon Markov decision process with restricted observations. The optimal policy for the system with restricted observations is a function of the observation process and not of the unobservable states of the system. Thus, the policy is stationary with respect to the partitioned state space. The algorithm we propose addresses the undiscounted average cost case. It combines a local search with a modified version of Howard's policy iteration method (Dynamic Programming and Markov Processes, MIT Press, Cambridge, MA, 1960). We demonstrate empirically that the algorithm finds the optimal deterministic policy for over 96% of the problem instances generated. For large-scale problem instances, we demonstrate that the average cost associated with the locally optimal policy is lower than the average cost associated with an integer-rounded policy produced by the algorithm of Serin and Kulkarni (Math Methods Oper Res 61 (2005), 311-328).
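To make the setting in the abstract concrete, here is a minimal Python sketch of a local search over deterministic, observation-based policies under the average cost criterion. It is not the authors' algorithm: it omits the modified policy iteration step entirely and evaluates each candidate policy by brute force through its stationary distribution. The names `obs`, `P`, `c`, `average_cost`, and `local_search`, as well as the toy data, are assumptions made for illustration, and the chain induced by every policy is assumed to be ergodic.

```python
def stationary(Ppi, iters=10_000, tol=1e-12):
    """Stationary distribution by power iteration (assumes an ergodic chain)."""
    n = len(Ppi)
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(pi[s] * Ppi[s][t] for s in range(n)) for t in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def average_cost(policy, P, c, obs):
    """Long-run average cost of a policy that maps observation class -> action."""
    n = len(obs)
    acts = [policy[obs[s]] for s in range(n)]   # action actually taken in each state
    Ppi = [P[acts[s]][s] for s in range(n)]     # transition rows under the policy
    pi = stationary(Ppi)
    return sum(pi[s] * c[s][acts[s]] for s in range(n))


def local_search(policy, P, c, obs, n_actions):
    """Change one observation class's action at a time while the cost improves."""
    policy = list(policy)
    best = average_cost(policy, P, c, obs)
    improved = True
    while improved:
        improved = False
        for k in range(len(policy)):
            for a in range(n_actions):
                if a == policy[k]:
                    continue
                trial = policy[:k] + [a] + policy[k + 1:]
                g = average_cost(trial, P, c, obs)
                if g < best - 1e-9:             # accept strict improvements only
                    policy, best, improved = trial, g, True
    return tuple(policy), best


# Hypothetical toy instance: 3 states, 2 actions; states 0 and 1 share an observation.
obs = [0, 0, 1]
P = [  # P[a][s][t] = probability of moving from s to t under action a
    [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]],
    [[0.1, 0.1, 0.8], [0.7, 0.2, 0.1], [0.6, 0.2, 0.2]],
]
c = [[1.0, 4.0], [3.0, 2.0], [5.0, 0.5]]        # c[s][a] = one-step cost

start = (0, 0)
start_cost = average_cost(start, P, c, obs)
found, found_cost = local_search(start, P, c, obs, n_actions=2)
print(found, round(found_cost, 4))
```

Because the policy is indexed by observation class rather than by state, states 0 and 1 are always forced to take the same action, which is the restriction the abstract describes. The search stops at a policy that no single-class change can improve, so it may return a local rather than a global optimum.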

Bibliographic details

Source
Naval Research Logistics, 2009, No. 1, pp. 86-92 (7 pages)
Authors
Lauren B. Davis; Thom J. Hodgson; Russell E. King; Wenbin Wei
Author affiliation

Department of Industrial and Systems Engineering, North Carolina A&T State University, Greensboro, North Carolina 27411
Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Navy
Keywords
Markov decision process; heuristics; optimal control

Similar documents

1. New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system [J]. Ohno Katsuhisa, Boh Toshitaka, Nakade Koichi. European Journal of Operational Research. 2016, No. 1
2. Efficient Algorithms for Budget-Constrained Markov Decision Processes [J]. Caramanis C., Dimitrov N.B., Morton D.P. IEEE Transactions on Automatic Control. 2014, No. 10
3. Efficient computation of time-bounded reachability probabilities in uniform continuous-time Markov decision processes [J]. Christel Baier, Holger Hermanns, Joost-Pieter Katoen. Theoretical Computer Science. 2005, No. 1
4. A Computationally Efficient Algorithm for Undiscounted Markov Decision Processes with Restricted Observation [C]. Proceedings of the 7th World Congress on Intelligent Control and Automation (WCICA'08). 2008
5. Quantifying shared information value in a supply chain using decentralized Markov decision processes with restricted observations [D]. Wei, Wenbin. 2005
6. Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes [O]. Taylor Killian, Samuel Daulton, George Konidaris
7. Computational comparison of value iteration algorithms for discounted Markov decision processes [O]. Thomas L. C., Hartley R., Lavercombe A.C. 1982
8. Computation Techniques for Large Scale Undiscounted Markov Decision Processes [R]. Hodgson, T. J., Koehler, G. J. 1978
