Journal: IEEE Transactions on Systems, Man, and Cybernetics

A Continuous-Time Markov Decision Process-Based Method With Application in a Pursuit-Evasion Example

Abstract

This paper presents a novel method, based on the continuous-time Markov decision process (CTMDP), to address the uncertainties in the pursuit-evasion problem. The primary difference between the CTMDP and the Markov decision process (MDP) is that the former takes into account the influence of the transition time between states. A policy iteration method based on performance potentials for solving the CTMDP is presented, together with its convergence. The results obtained with the MDP-based method demonstrate that it is a special case of the CTMDP-based method, one with an identity transition rate matrix. To compare the methods, a well-known pursuit-evasion problem involving two identical cars is solved as a benchmark. The CTMDP-based method provides a discretized solution that is close to the analytical solution obtained by the differential game method. It also shows strong robustness against changes in the transition probability compared with the traditional MDP-based method. To the best of our knowledge, this is the first attempt to validate the influence of the transition time between states in such a pursuit-evasion scenario, or in a similar application, solved by an MDP-related model. The CTMDP-based method offers a new approach to solving the pursuit-evasion problem and can be extended to similar optimization applications.
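The abstract does not spell out the algorithm, so the following is only a minimal sketch of textbook policy iteration for a finite CTMDP under a discounted criterion. It is not the paper's potential-based method. The names `policy_iteration`, `solve_linear`, `rates`, `rewards`, and `alpha`, as well as the two-state toy model, are assumptions made for illustration. Policy evaluation solves $(\alpha I - Q_\pi) v = r_\pi$, and policy improvement picks, in each state, the action maximizing $r(s,a) + \sum_j q(j \mid s,a)\, v(j)$.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def solve_linear(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def policy_iteration(
    rates: List[Matrix],        # rates[a][s][j]: rate s -> j under action a (rows sum to 0)
    rewards: List[List[float]], # rewards[a][s]: reward rate in state s under action a
    alpha: float,               # discount rate, assumed > 0
    max_iter: int = 100,
) -> Tuple[List[int], List[float]]:
    n_actions, n_states = len(rates), len(rates[0])
    policy = [0] * n_states
    v = [0.0] * n_states
    for _ in range(max_iter):
        # Policy evaluation: solve (alpha * I - Q_pi) v = r_pi.
        a_mat = [
            [(alpha if s == j else 0.0) - rates[policy[s]][s][j]
             for j in range(n_states)]
            for s in range(n_states)
        ]
        r_pi = [rewards[policy[s]][s] for s in range(n_states)]
        v = solve_linear(a_mat, r_pi)

        def q_value(s: int, a: int) -> float:
            return rewards[a][s] + sum(rates[a][s][j] * v[j] for j in range(n_states))

        # Policy improvement: switch only on a strict gain, so ties keep the old action.
        new_policy = []
        for s in range(n_states):
            best = policy[s]
            for a in range(n_actions):
                if q_value(s, a) > q_value(s, best) + 1e-12:
                    best = a
            new_policy.append(best)
        if new_policy == policy:
            break
        policy = new_policy
    return policy, v


# Hypothetical toy model: state 0 = "apart", state 1 = "captured" (absorbing).
# Action 0 closes in slowly at no cost; action 1 closes in faster at a small cost.
rates = [
    [[-1.0, 1.0], [0.0, 0.0]],
    [[-3.0, 3.0], [0.0, 0.0]],
]
rewards = [[0.0, 1.0], [-0.1, 1.0]]
policy, values = policy_iteration(rates, rewards, alpha=0.5)
print(policy, values)
```

Because the transition rates enter the evaluation step directly, the expected time spent in each state influences the values. This is the kind of transition-time effect the abstract attributes to the CTMDP formulation.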