Journal: Kybernetika

FIRST PASSAGE RISK PROBABILITY OPTIMALITY FOR CONTINUOUS TIME MARKOV DECISION PROCESSES



Abstract

In this paper, we study continuous-time Markov decision processes (CTMDPs) with a denumerable state space, a Borel action space, unbounded transition rates, and a nonnegative reward function. The optimality criterion considered is the first passage risk probability criterion. To ensure non-explosion of the state process, we first introduce a so-called drift condition, which is weaker than the well-known regular condition for semi-Markov decision processes (SMDPs). Furthermore, under suitable conditions, we use a value iteration (recursive approximation) technique to establish the optimality equation and to obtain the uniqueness of the value function and the existence of optimal policies. Finally, two examples illustrate our results.
