Average optimality inequality for continuous-time Markov decision processes in Polish spaces

Quanxin Zhu


Abstract

In this paper, we study the average optimality for continuous-time controlled jump Markov processes in general state and action spaces. The criterion to be minimized is the average expected costs. Both the transition rates and the cost rates are allowed to be unbounded. We propose another set of conditions under which we first establish one average optimality inequality by using the well-known “vanishing discounting factor approach”. Then, when the cost (or reward) rates are nonnegative (or nonpositive), from the average optimality inequality we prove the existence of an average optimal stationary policy in all randomized history dependent policies by using the Dynkin formula and the Tauberian theorem. Finally, when the cost (or reward) rates have neither upper nor lower bounds, we also prove the existence of an average optimal policy in all (deterministic) stationary policies by constructing a “new” cost (or reward) rate.
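For orientation, the "vanishing discounting factor approach" named in the abstract usually runs along the lines sketched below. The notation is assumed here for illustration and is not taken from the paper: S is the state space, A(x) the set of admissible actions at state x, q(dy | x, a) the conservative transition rates, c(x, a) the cost rate, V_α the optimal α-discounted cost, and x_0 a fixed reference state. The paper's own conditions and exact statement may differ.

```latex
% Sketch under assumed notation; not the paper's own statement.
\begin{align*}
% Discounted-cost optimality equation, discount rate \alpha > 0:
\alpha V_\alpha(x)
  &= \inf_{a \in A(x)} \Big\{ c(x,a) + \int_S V_\alpha(y)\, q(dy \mid x,a) \Big\},
  \qquad x \in S. \\[4pt]
% Relative differences with respect to the reference state x_0.
% Because q(S \mid x,a) = 0, subtracting a constant leaves the integral unchanged:
u_\alpha(x) &:= V_\alpha(x) - V_\alpha(x_0), \\
\alpha V_\alpha(x_0) + \alpha u_\alpha(x)
  &= \inf_{a \in A(x)} \Big\{ c(x,a) + \int_S u_\alpha(y)\, q(dy \mid x,a) \Big\}. \\[4pt]
% Let \alpha_n \downarrow 0 with \alpha_n V_{\alpha_n}(x_0) \to g^*,
% \alpha_n u_{\alpha_n}(x) \to 0, and u a suitable (often lower) limit of u_{\alpha_n}:
g^* &\ge \inf_{a \in A(x)} \Big\{ c(x,a) + \int_S u(y)\, q(dy \mid x,a) \Big\},
  \qquad x \in S.
\end{align*}
```

The last line is the generic form of an average optimality inequality: only an inequality, rather than an equation, survives the limit, which is why weaker conditions suffice than for the full average optimality equation.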

Bibliographic record

Source: Mathematical Methods of Operations Research, 2007, Issue 2, pp. 299-313 (15 pages)
Author: Quanxin Zhu
Affiliation: Department of Mathematics, South China Normal University, Guangzhou 510631, People's Republic of China
Original format: PDF
Language: English
Keywords: Continuous-time Markov decision process; Average optimality inequality; General state space; Unbounded cost; Optimal stationary policy
Mathematics Subject Classification: 90C40; 93E20

