IEEE Transactions on Automatic Control

First Passage Optimality for Continuous-Time Markov Decision Processes With Varying Discount Factors and History-Dependent Policies

Abstract

This paper studies the first passage optimality criterion for continuous-time Markov decision processes with state-dependent discount factors and history-dependent policies. The state space is denumerable, the action space is a Borel space, and the transition and reward rates are unbounded. Under suitable conditions, we show the existence of a deterministic stationary optimal policy, establish the Bellman (optimality) equation, of which the value function is the unique solution, and give value and policy iteration algorithms for computing (or at least approximating) the value function and an optimal policy. Furthermore, we give examples on reliability and on controlled birth processes with killing to illustrate potential applications of the results and to show how the main results of this paper differ from those in the previous literature.
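
To make the value iteration idea concrete, here is a minimal sketch on a small finite model. It is not the paper's algorithm or notation: it assumes the textbook form of the first passage Bellman equation, alpha(i) V(i) = max_a { r(i,a) + sum_j q(j|i,a) V(j) } for states outside the target set and V = 0 on the target set, with finitely many states and actions. The names `backup` and `value_iteration`, the three-state reliability-style model, and all rates and rewards are hypothetical and chosen only for illustration.

```python
def backup(i, a, V, q, r, alpha):
    """One-step value of taking action a in non-target state i.

    Rearranges the assumed Bellman equation into fixed-point form:
    V(i) = (r(i,a) + sum_{j != i} q(j|i,a) V(j)) / (alpha(i) + exit rate).
    """
    exit_rate = sum(q[i][a].values())  # total rate of leaving state i
    flow = sum(rate * V[j] for j, rate in q[i][a].items())
    return (r[i][a] + flow) / (alpha[i] + exit_rate)


def value_iteration(states, target, actions, q, r, alpha,
                    tol=1e-10, max_iter=100_000):
    """Iterate the backup until successive iterates differ by less than tol."""
    V = {i: 0.0 for i in states}
    for _ in range(max_iter):
        new_V = {
            # The process stops at the target set, so its value is 0 there.
            i: 0.0 if i in target
            else max(backup(i, a, V, q, r, alpha) for a in actions[i])
            for i in states
        }
        gap = max(abs(new_V[i] - V[i]) for i in states)
        V = new_V
        if gap < tol:
            break
    # Greedy deterministic stationary policy with respect to the final V.
    policy = {
        i: max(actions[i], key=lambda a: backup(i, a, V, q, r, alpha))
        for i in states if i not in target
    }
    return V, policy


# Hypothetical model: 2 = working, 1 = degraded, 0 = failed (target set).
states = [0, 1, 2]
target = {0}
alpha = {1: 0.1, 2: 0.05}  # state-dependent discount rates (assumed values)
actions = {1: ["wait", "repair"], 2: ["wait"]}
# q[i][a][j]: transition rate from i to j (j != i) under action a.
q = {
    1: {"wait": {0: 1.0}, "repair": {2: 2.0, 0: 0.5}},
    2: {"wait": {1: 1.0}},
}
# r[i][a]: reward rate earned while in state i using action a.
r = {1: {"wait": 1.0, "repair": -1.0}, 2: {"wait": 5.0}}

V, policy = value_iteration(states, target, actions, q, r, alpha)
print(V, policy)
```

Because every discount rate is strictly positive, each backup shrinks differences by a factor of at most exit rate / (alpha(i) + exit rate) < 1, so the iteration converges on this finite model. The unbounded rates and denumerable state space treated in the paper need the paper's own conditions, which this sketch does not reproduce.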
