Journal: Mathematical Methods of Operations Research

Average optimality inequality for continuous-time Markov decision processes in Polish spaces

Abstract

In this paper, we study average optimality for continuous-time controlled jump Markov processes in general state and action spaces. The criterion to be minimized is the expected average cost. Both the transition rates and the cost rates are allowed to be unbounded. We propose another set of conditions under which we first establish an average optimality inequality by using the well-known “vanishing discounting factor approach”. Then, when the cost (or reward) rates are nonnegative (or nonpositive), we use the average optimality inequality, together with the Dynkin formula and the Tauberian theorem, to prove the existence of an average optimal stationary policy within the class of all randomized history-dependent policies. Finally, when the cost (or reward) rates have neither upper nor lower bounds, we also prove the existence of an average optimal policy within the class of all (deterministic) stationary policies by constructing a “new” cost (or reward) rate.
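
For orientation, the objects named in the abstract can be written out in a standard textbook form. The block below is only a sketch, and its notation is assumed here rather than taken from the paper: state space $S$, admissible action sets $A(x)$, cost rate $c(x,a)$, conservative transition rates $q(dy \mid x,a)$ (so that $q(S \mid x,a) = 0$), $\alpha$-discounted optimal value $V_\alpha^*$, and a fixed reference state $x_0$. The paper's own conditions and exact statements may differ.

```latex
% Notation is assumed for illustration; it is not quoted from the paper.

% Long-run expected average cost of a policy \pi started at x
% (for a randomized policy, c(x_t, a_t) is read as the cost rate
% averaged over the action distribution chosen at time t):
\[
  J(x,\pi) \;=\; \limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}_x^{\pi}\!\left[\int_0^T c(x_t,a_t)\,dt\right].
\]

% Discounted optimality equation for discount factor \alpha > 0:
\[
  \alpha V_\alpha^*(x) \;=\; \inf_{a\in A(x)}
  \left\{ c(x,a) + \int_S V_\alpha^*(y)\,q(dy\mid x,a) \right\}.
\]

% Relative differences u_\alpha(x) = V_\alpha^*(x) - V_\alpha^*(x_0).
% Because q(S | x,a) = 0, the constant V_\alpha^*(x_0) integrates to zero:
\[
  \alpha V_\alpha^*(x_0) + \alpha u_\alpha(x) \;=\; \inf_{a\in A(x)}
  \left\{ c(x,a) + \int_S u_\alpha(y)\,q(dy\mid x,a) \right\}.
\]

% Vanishing discount step: along a sequence \alpha_n \downarrow 0 with
% \alpha_n V_{\alpha_n}^*(x_0) \to g^* and u^* = \liminf_n u_{\alpha_n},
% a Fatou-type argument typically yields an inequality, not an equation:
\[
  g^* \;\ge\; \inf_{a\in A(x)}
  \left\{ c(x,a) + \int_S u^*(y)\,q(dy\mid x,a) \right\},
  \qquad x \in S.
\]
```

In this sketch, a stationary policy that attains the infimum for every $x$ is the natural candidate for an average optimal policy; the Dynkin formula and the Tauberian theorem mentioned in the abstract are the kind of tools used to compare its average cost with that of an arbitrary history-dependent policy.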
