Journal: IAES International Journal of Robotics and Automation

RSMDP-based Robust Q-learning for Optimal Path Planning in a Dynamic Environment

Abstract

This paper presents a robust Q-learning method for path planning in a dynamic environment. The method consists of three steps. First, a regime-switching Markov decision process (RSMDP) is formed to represent the dynamic environment. Second, a probabilistic roadmap (PRM) is constructed, integrated with the RSMDP, and stored as a graph whose nodes correspond to collision-free world states for the robot. Third, an online Q-learning method with a dynamic step size, which facilitates robust convergence of the Q-value iteration, is integrated with the PRM to determine an optimal path to the goal. In this manner, the robot can use past experience to improve its performance in avoiding both static and moving obstacles, without knowing the nature of the obstacle motion. The use of regime switching to avoid obstacles with unknown motion is particularly innovative. The developed approach is applied to a homecare robot in computer simulation. The results show that the online path planner with Q-learning converges rapidly and successfully to the correct path.
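The three steps can be made concrete with a small sketch. The abstract does not give the roadmap, the reward values, the regime model, or the step-size rule, so everything below is an assumption for illustration and not the authors' implementation. It uses a hypothetical five-node graph in place of a PRM, a hidden two-state Markov chain that moves an obstacle on and off one node as a stand-in for regime switching, and a visit-count step size alpha = 1 / n(s, a) as one common choice of "dynamic" step size.

```python
import random

# Hypothetical roadmap standing in for a PRM graph: node -> neighbouring nodes.
ROADMAP = {0: [1, 2], 1: [3, 0], 2: [3, 0], 3: [4, 1, 2], 4: []}
START, GOAL = 0, 4
OBSTACLE_NODE = 1   # assumed: the obstacle occupies this node in the "blocking" regime
SWITCH_PROB = 0.2   # assumed per-step probability that the regime switches


def train(episodes=500, gamma=0.9, epsilon=0.2, max_steps=50, seed=0):
    """Tabular Q-learning with an assumed step size alpha = 1 / n(s, a)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, nbrs in ROADMAP.items() for a in nbrs}
    visits = dict.fromkeys(q, 0)
    blocking = False  # hidden regime; the learner never observes it
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if state == GOAL:
                break
            # Regime switching: a two-state Markov chain drives the obstacle.
            if rng.random() < SWITCH_PROB:
                blocking = not blocking
            actions = ROADMAP[state]
            # Epsilon-greedy choice of the next roadmap node.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            # Assumed rewards: collision penalty, goal bonus, unit step cost.
            if blocking and action == OBSTACLE_NODE:
                reward, nxt = -10.0, state  # blocked move: the robot stays put
            elif action == GOAL:
                reward, nxt = 10.0, action
            else:
                reward, nxt = -1.0, action
            # Step size shrinks with the number of visits to (state, action).
            visits[(state, action)] += 1
            alpha = 1.0 / visits[(state, action)]
            future = max((q[(nxt, a)] for a in ROADMAP[nxt]), default=0.0)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = nxt
    return q


def greedy_path(q, max_len=10):
    """Follow the highest-valued edge from START until GOAL or max_len nodes."""
    path = [START]
    while path[-1] != GOAL and len(path) < max_len:
        s = path[-1]
        path.append(max(ROADMAP[s], key=lambda a: q[(s, a)]))
    return path
```

Calling `greedy_path(train())` returns the route the learned Q-values prefer. Because the obstacle intermittently blocks node 1, the learner should come to favor the route through node 2 without ever being told how the obstacle moves. The sketch ignores the case where the obstacle arrives at a node the robot already occupies.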
