Conference: IFAC World Congress

A Continuous-time Markov Decision Process Based Method on Pursuit-Evasion Problem



Abstract

This paper presents a method for the pursuit-evasion problem that incorporates the behaviors of the opponent. The method is built on a continuous-time Markov decision process (CTMDP) model, which differs from a standard Markov decision process (MDP) mainly in that it takes the transition time between states into account. By introducing the concept of a situation, probabilities describing average behaviors are obtained, and these probabilities are then used to construct the transition matrix of the CTMDP. A policy iteration method for solving the CTMDP is also given. To demonstrate the CTMDP method for pursuit-evasion, examples in a grid environment are computed. The CTMDP-based method offers a new approach to pursuit-evasion modeling and may be extended to similar sequential decision problems.
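As a rough illustration of what policy iteration on a CTMDP can look like, the sketch below uses the textbook discounted-reward formulation, which is an assumption here; the abstract does not state the paper's optimality criterion, state space, or rates. The array names `Q` (transition-rate matrices, one per action), `r` (reward rates), and `alpha` (discount rate), as well as the two-state toy numbers, are hypothetical and not taken from the paper.

```python
import numpy as np

def evaluate_policy(Q, r, policy, alpha):
    """Solve (alpha*I - Q_pi) v = r_pi for the value of a fixed policy.

    Q has shape (A, S, S): Q[a] is a transition-rate matrix whose rows sum to 0.
    r has shape (S, A): reward rate earned while in state s under action a.
    """
    n_states = r.shape[0]
    idx = np.arange(n_states)
    Q_pi = Q[policy, idx, :]   # row s is taken from the matrix of action policy[s]
    r_pi = r[idx, policy]
    return np.linalg.solve(alpha * np.eye(n_states) - Q_pi, r_pi)

def policy_iteration(Q, r, alpha, max_iter=100):
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    n_states = r.shape[0]
    idx = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    v = evaluate_policy(Q, r, policy, alpha)
    for _ in range(max_iter):
        # q_values[s, a] = r[s, a] + sum_j Q[a, s, j] * v[j]
        q_values = r + np.einsum("asj,j->sa", Q, v)
        best = q_values.argmax(axis=1)
        # Switch only on a strict improvement so that ties cannot cause cycling.
        improved = q_values[idx, best] > q_values[idx, policy] + 1e-12
        new_policy = np.where(improved, best, policy)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
        v = evaluate_policy(Q, r, policy, alpha)
    return policy, v

# Hypothetical two-state, two-action toy problem (not from the paper).
# The mean holding time in state s under action a is -1 / Q[a, s, s],
# which is where the transition time enters the model.
Q = np.array([
    [[-1.0, 1.0], [0.5, -0.5]],   # action 0: slow transitions
    [[-3.0, 3.0], [2.0, -2.0]],   # action 1: fast transitions
])
r = np.array([[0.0, -1.0],
              [2.0, 1.0]])
alpha = 0.1

policy, v = policy_iteration(Q, r, alpha)
print("policy:", policy)
print("values:", v)
```

At termination the returned values satisfy the discounted optimality equation alpha * v(s) = max_a { r(s, a) + sum_j Q[a, s, j] * v(j) } up to numerical tolerance. In a pursuit-evasion setting the rate matrices would presumably be built from the situation-based probabilities the abstract mentions, but that construction is not specified here.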
