Approximating Ergodic Average Reward Continuous-Time Controlled Markov Chains

Prieto-Rumeau T.; Lorenzo J. M.

Abstract

We study the approximation of an ergodic average reward continuous-time denumerable state Markov decision process (MDP) by means of a sequence of MDPs. Our results include the convergence of the corresponding optimal policies and the optimal gains. For a controlled upwardly skip-free process, we show some computational results to illustrate the convergence theorems.

Bibliographic details

Source: Automatic Control, IEEE Transactions on | 2010, Issue 1 | pp. 201-207 | 7 pages
Authors: Prieto-Rumeau T.; Lorenzo J. M.
Affiliation: Department of Statistics and Operations Research, UNED, Madrid, Spain
Original format: PDF
Keywords: Approximation of control problems; ergodic Markov decision processes (MDPs); policy iteration algorithm

