Journal: Kybernetika

FIRST PASSAGE RISK PROBABILITY OPTIMALITY FOR CONTINUOUS TIME MARKOV DECISION PROCESSES



Abstract

In this paper, we study continuous-time Markov decision processes (CTMDPs) with a denumerable state space, a Borel action space, unbounded transition rates, and a nonnegative reward function. The optimality criterion considered is the first passage risk probability criterion. To ensure the non-explosion of the state processes, we first introduce a so-called drift condition, which is weaker than the well-known regular condition for semi-Markov decision processes (SMDPs). Furthermore, under suitable conditions, we use a value iteration (recursive approximation) technique to establish the optimality equation and to prove the uniqueness of the value function and the existence of optimal policies. Finally, two examples illustrate our results.
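The abstract does not spell out the criterion, so the following is only an illustrative sketch under a stated assumption: the first passage risk probability is read here as the probability that the total reward accumulated before the state first enters a target set does not exceed a threshold level. The function `estimate_risk`, the `model` layout, the action names, and the toy numbers are all hypothetical and are not taken from the paper. The sketch estimates this probability by Monte Carlo simulation for a fixed policy; it does not implement the paper's value iteration.

```python
import math
import random


def estimate_risk(model, policy, target, start, level,
                  n_paths=20000, max_jumps=10000, seed=0):
    """Monte Carlo estimate of P(reward before first passage <= level).

    model[state][action] = (transition_rate, reward_rate, {next_state: prob})
    policy(state, remaining_level) -> action   (hypothetical interface)
    """
    rng = random.Random(seed)
    risky = 0
    for _ in range(n_paths):
        state, remaining = start, level
        # max_jumps is a safeguard: paths that have not reached the target
        # within this many jumps are simply not counted as risky.
        for _ in range(max_jumps):
            if state in target:
                risky += 1  # target reached while reward <= level
                break
            action = policy(state, remaining)
            rate, reward_rate, jump_probs = model[state][action]
            # Exponential sojourn time; reward accrues at a constant rate.
            remaining -= reward_rate * rng.expovariate(rate)
            if remaining < 0:
                break  # nonnegative rewards: the level is exceeded for good
            states = list(jump_probs)
            weights = [jump_probs[s] for s in states]
            state = rng.choices(states, weights=weights)[0]
    return risky / n_paths


# Toy model (assumed): state 0 is non-target, state 1 is the target.
model = {
    0: {
        "slow": (2.0, 1.0, {1: 1.0}),
        "fast": (4.0, 1.0, {1: 1.0}),
    }
}
target = {1}
level = 0.5

for name in ("slow", "fast"):
    rate = model[0][name][0]
    estimate = estimate_risk(model, lambda s, lam, a=name: a, target, 0, level)
    exact = 1.0 - math.exp(-rate * level)  # P(Exp(rate) <= level)
    print(f"{name}: estimate={estimate:.3f}, exact={exact:.3f}")
```

In this toy model the reward before first passage is a single exponential sojourn time, so the estimate can be checked against the closed form `1 - exp(-rate * level)`. The "slow" action gives the smaller risk probability, so a risk-minimizing controller would prefer it here.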
