Blackwell Optimality in the Class of All Policies in Markov Decision Chains with a Borel State Space and Unbounded Rewards

Abstract

This paper is the second part of the authors' study of Blackwell optimal policies in Markov decision chains with a Borel state space and unbounded rewards. The authors prove that a stationary policy is Blackwell optimal in the class of all history-dependent policies if it is Blackwell optimal in the class of stationary policies. The authors also develop recurrence and drift conditions that ensure the ergodicity and integrability assumptions made in the previous paper and that are more suitable for applications. As an example, the authors study a cash-balance model.

Bibliographic record

Authors
Hordijk, A.; Yushkevich, A. A.
Year: 2000
Pages: 1-34
Total pages: 34
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords
Optimization; Markov chains; Policies; Borel sets; Topology; Recurrence; Drift conditions; Markov processes; Decision theory; Geometry; Theorems

Similar literature

1. Blackwell optimality in the class of all policies in Markov decision chains with a Borel state space and unbounded rewards [J]. Arie Hordijk, Alexander A. Yushkevich. Mathematical Methods of Operations Research. 1999, No. 3
2. Blackwell optimality in the class of stationary policies in Markov decision chains with a Borel state space and unbounded rewards [J]. Arie Hordijk, Alexander A. Yushkevich. Mathematical Methods of Operations Research. 1999, No. 1
3. Blackwell optimality in the class of Markov policies for continuous-time controlled Markov chains [J]. Prieto-Rumeau, T. Acta Applicandae Mathematicae: An International Journal on Applying Mathematics and Mathematical Applications. 2006, No. 1
4. Blackwell optimality in Markov decision processes with a Borel state space [C]. Yushkevich, A. A. 1997
5. Performance guarantee of a sub-optimal policy for a discrete Markov decision process and its application to a robotic surveillance problem [D]. Park, Myoungkuk. 2014
6. Optimal Information Collection Policies in a Markov Decision Process Framework [O]. Lauren E. Cipriano, Jeremy D. Goldhaber-Fiebert, Shan Liu
7. Average, sensitive and Blackwell-optimal policies in denumerable Markov decision chains with unbounded rewards [O]. Dekker, R. (Rommert), Hordijk, A. (Arie). 1988
8. Contraction Conditions for Average and alpha-Discount Optimality in Countable State Markov Games with Unbounded Rewards [R]. Altman, E., Hordijk, A., Spieksma, F. M. 1994
