Properties of the optimality equation and optimal policies in discrete time Markov decision processes

Qiying Hu; Wuyi Yue

Abstract

This paper investigates properties of the optimality equation and of optimal policies in discrete-time Markov decision processes with expected discounted total rewards, under the weak conditions that the model is well defined and the optimality equation holds. The optimal value function is characterized as a solution of the optimality equation, and the structure of optimal policies is also given.
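
As a rough illustration of what "the optimal value function is a solution of the optimality equation" means, the sketch below solves the discounted optimality equation V(i) = max_a { r(i, a) + beta * sum_j p(j | i, a) V(j) } by value iteration. It is not the authors' method: it assumes a finite state and action space, bounded rewards, and a discount factor beta < 1, which are much stronger than the weak conditions the abstract refers to. The notation, the function name `value_iteration`, and the two-state example are all hypothetical.

```python
import numpy as np


def value_iteration(P, r, beta, tol=1e-10, max_iter=10_000):
    """Solve the discounted optimality equation for a finite MDP.

    P    : array of shape (A, S, S); P[a, s, j] is the probability of
           moving from state s to state j under action a (assumed layout).
    r    : array of shape (S, A); r[s, a] is the one-step reward.
    beta : discount factor, assumed to lie in [0, 1).
    """
    V = np.zeros(r.shape[0])
    for _ in range(max_iter):
        # Q[s, a] = r[s, a] + beta * sum_j P[a, s, j] * V[j]
        Q = r + beta * np.einsum("asj,j->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # A stationary policy that attains the maximum in every state.
    Q = r + beta * np.einsum("asj,j->sa", P, V)
    return V, Q.argmax(axis=1)


# Hypothetical example: action 0 stays put, action 1 switches state.
# Only "stay in state 1" pays a reward of 1.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
r = np.array([
    [0.0, 0.0],  # rewards in state 0
    [1.0, 0.0],  # rewards in state 1
])
beta = 0.9

V, policy = value_iteration(P, r, beta)

# Residual of the optimality equation at the computed V (should be ~0).
residual = np.max(np.abs(
    V - (r + beta * np.einsum("asj,j->sa", P, V)).max(axis=1)
))
```

In this toy model the fixed point can be checked by hand: staying in state 1 forever is worth 1 / (1 - 0.9) = 10, and switching from state 0 is worth 0.9 * 10 = 9, so the policy that attains the maximum in the optimality equation is "switch in state 0, stay in state 1".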

Bibliographic record

Source: 電子情報通信学会技術研究報告. 回路とシステム. Circuits and Systems, 2002, No. 427, 6 pages
Authors: Qiying Hu; Wuyi Yue
Original format: PDF
Language of text: jpn
CLC classification: Communications
Keywords: Discrete time; Markov decision processes; Optimality equation; Optimal policies; Expected discounted total rewards

Similar literature

1. Properties of the optimality equation and optimal policies in discrete time Markov decision processes [J]. Qiying Hu, Wuyi Yue. 電子情報通信学会技術研究報告. 回路とシステム. Circuits and Systems. 2002, No. 427
2. Properties of the optimality equation and optimal policies in discrete time Markov decision processes [J]. Qiying Hu, Wuyi Yue. 電子情報通信学会技術研究報告. コンカレント工学. Concurrent System Technology. 2002, No. 429
3. Sufficiency of Markov policies for continuous-time Markov decision processes and solutions to Kolmogorov's forward equation for jump Markov processes [C]. Feinberg E.A., Mandava M., Shiryaev A.N. IEEE Annual Conference on Decision and Control. 2013
4. Performance guarantee of a sub-optimal policy for a discrete Markov decision process and its application to a robotic surveillance problem [D]. Park, Myoungkuk. 2014
5. Optimal Information Collection Policies in a Markov Decision Process Framework [O]. Lauren E. Cipriano, Jeremy D. Goldhaber-Fiebert, Shan Liu
6. Properties of the Optimality Equation and Optimal Policies in Discrete Time Markov Decision Processes and Their Applications [O]. Qiying Hu, Wuyi Yue. 2003
7. Comments on the Sensitivity of the Optimal Cost and the Optimal Policy for a Discrete Markov Decision Process [R]. Sernik, E. L., Marcus, S. I. 1989
