IEEE Transactions on Automatic Control

The optimal search for a Markovian target when the search path is constrained: the infinite-horizon case



Abstract

A target moves among a finite number of cells according to a discrete-time homogeneous Markov chain. The searcher is subject to constraints on the search path, i.e., the cells available for search in the current epoch are a function of the cell searched in the previous epoch. The aim is to identify a search policy that maximizes the infinite-horizon total expected reward. We show the following structural results under the assumption that the target transition matrix is ergodic: 1) the optimal search policy is stationary; and 2) there exist ε-optimal stationary policies, which may be constructed by the standard value iteration algorithm in finite time. These results are obtained by showing that the dynamic programming operator associated with the search problem is an m-stage contraction mapping on a suitably defined space. Upper bounds on m and on the coefficient of contraction α are given in terms of the transition matrix and other variables pertaining to the search problem. These bounds on m and α may be used to derive bounds on the suboptimal search policies constructed.
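
To illustrate why an m-stage contraction yields ε-optimal policies in finite time, the generic textbook bound can be sketched as follows. This is not the paper's specific result: the symbols T (the dynamic programming operator), V_0 (an arbitrary starting value function), V^* (the fixed point), k (the number of m-step blocks), and the sup norm are assumptions introduced here for illustration.

```latex
% Assumption: T^m is a contraction with coefficient \alpha \in [0,1) in the sup norm,
% so T has a unique fixed point V^* = T V^*.
\begin{align*}
  \|T^{km} V_0 - V^*\| &\le \alpha^{k}\,\|V_0 - V^*\|, \\
  \|V_0 - V^*\| &\le \|V_0 - T^{m} V_0\| + \|T^{m} V_0 - T^{m} V^*\|
                 \le \|V_0 - T^{m} V_0\| + \alpha\,\|V_0 - V^*\|, \\
  \text{hence}\quad
  \|T^{km} V_0 - V^*\| &\le \frac{\alpha^{k}}{1-\alpha}\,\|T^{m} V_0 - V_0\|.
\end{align*}
% The right-hand side is at most \epsilon once
%   k \ge \frac{\log\bigl(\epsilon (1-\alpha) / \|T^{m} V_0 - V_0\|\bigr)}{\log \alpha},
% i.e. after km applications of T.
```

Under these assumptions, any upper bounds on m and α translate directly into an upper bound on the number of value iteration steps needed to reach a given accuracy ε.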
