首页> 外文学位 >Improved Heuristic Search Algorithms for Decision-Theoretic Planning
【24h】

Improved Heuristic Search Algorithms for Decision-Theoretic Planning

机译:用于决策理论规划的改进启发式搜索算法

获取原文
获取原文并翻译 | 示例

摘要

A large class of practical planning problems that require reasoning about uncertain outcomes, as well as tradeoffs among competing goals, can be modeled as Markov decision processes (MDPs). This model has been studied for over 60 years, and has many applications that range from stochastic inventory control and supply-chain planning, to probabilistic model checking and robotic control. Standard dynamic programming algorithms solve these problems for the entire state space. A more efficient heuristic search approach focuses computation on solving these problems for the relevant part of the state space only, given a start state, and using heuristics to identify irrelevant parts of the state space that can be safely ignored. This dissertation considers the heuristic search approach to this class of problems, and makes three contributions that advance this approach.;The first contribution is a novel algorithm for solving MDPs that integrates the standard value iteration algorithm with branch-and-bound search. Called branch-and-bound value iteration, the new algorithm has several advantages over existing algorithms. The second contribution is the integration of recently-developed suboptimality bounds in heuristic search algorithm for MDPs, making it possible for iterative algorithms for solving these planning problems to detect convergence to a bounded-suboptimal solution. The third contribution is the evaluation and analysis of some techniques that are widely-used by state-of-the-art planning algorithms, the identification of some weaknesses of these techniques, and the development of a more efficient implementation of one of these techniques---a solved-labeling procedure that speeds converge by leveraging a decomposition of the state-space graph of a planning problem into strongly-connected components. The new algorithms and techniques introduced in this dissertation are experimentally evaluated on a range of widely-used planning benchmarks.
机译:可以要求对不确定结果进行推理以及在相互竞争的目标之间进行权衡的一大类实际计划问题可以建模为马尔可夫决策过程(MDP)。该模型已经研究了60多年,其应用范围广泛,从随机库存控制和供应链计划到概率模型检查和机器人控制。标准的动态编程算法解决了整个状态空间中的这些问题。一种更有效的启发式搜索方法将计算的重点放在给定起始状态的情况下仅针对状态空间的相关部分解决这些问题,并使用启发式方法识别状态空间中无关紧要的部分,这些部分可以安全地忽略。本文考虑了启发式搜索方法来解决这一类问题,并提出了三点改进的方法。第一,是将标准值迭代算法与分支定界搜索相结合的新型求解MDP的算法。称为分支定界值迭代,新算法与现有算法相比具有多个优点。第二个贡献是在MDP的启发式搜索算法中集成了最近开发的次优边界,这使得用于解决这些计划问题的迭代算法可以检测到有界次优解决方案的收敛性。第三个贡献是对最先进的规划算法广泛使用的某些技术进行评估和分析,识别这些技术的某些弱点以及开发其中一种技术的更有效实现方式- -解决标签的程序,通过将计划问题的状态空间图分解为紧密连接的组件来加快收敛速度​​。在一系列广泛使用的规划基准上,对本文介绍的新算法和新技术进行了实验评估。

著录项

  • 作者

    Abdoulahi, Ibrahim.;

  • 作者单位

    Mississippi State University.;

  • 授予单位 Mississippi State University.;
  • 学科 Computer science.
  • 学位 Ph.D.
  • 年度 2017
  • 页码 119 p.
  • 总页数 119
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号