International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences

TOWARDS CONTINUOUS CONTROL FOR MOBILE ROBOT NAVIGATION: A REINFORCEMENT LEARNING AND SLAM BASED APPROACH



Abstract

We introduce a new autonomous path planning algorithm that enables a mobile robot to reach target locations in an unknown environment while relying only on its on-board sensors. In particular, we describe the design and evaluation of a deep reinforcement learning motion planner, based on the deep deterministic policy gradient (DDPG), that outputs continuous linear and angular velocities to navigate to a desired target location. Additionally, the algorithm is enhanced by using the knowledge of the environment provided by a grid-based SLAM with a Rao-Blackwellized particle filter to shape the reward function, with the aim of improving the convergence rate, escaping local optima, and reducing the number of collisions with obstacles. A comparison is made between a reward function shaped using the map provided by the SLAM algorithm and a reward function that uses no knowledge of the map. Results show that, with the proposed approach, the learning time decreases in terms of the number of episodes required to converge (560 episodes compared to 1450 for the standard RL algorithm), and the number of obstacle collisions is also reduced, with a success ratio of 83% compared to 56% for the standard RL algorithm. The results are validated in a simulated experiment on a skid-steering mobile robot.
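As an illustration of the map-aware reward shaping idea described in the abstract, the Python sketch below shows one way a reward could combine progress toward the goal with a penalty based on the occupancy grid produced by a SLAM module. The function name, grid resolution, weights, and terminal bonuses are illustrative assumptions and are not taken from the paper.

import numpy as np

# Hypothetical sketch of map-aware reward shaping for a DDPG navigation agent.
# Weights, resolution, and terminal rewards are illustrative assumptions.

def shaped_reward(robot_xy, goal_xy, prev_dist, occupancy_grid, resolution=0.05,
                  collision=False, reached=False):
    """Combine goal progress with a penalty for entering cells the SLAM
    occupancy grid marks as likely obstacles (0 = free, 1 = occupied)."""
    if reached:
        return 100.0      # terminal bonus for reaching the target
    if collision:
        return -100.0     # terminal penalty for hitting an obstacle

    dist = np.linalg.norm(np.asarray(goal_xy) - np.asarray(robot_xy))
    progress = prev_dist - dist   # positive when the robot moves toward the goal

    # Look up the occupancy probability of the robot's current grid cell.
    ix = int(np.clip(int(robot_xy[0] / resolution), 0, occupancy_grid.shape[0] - 1))
    iy = int(np.clip(int(robot_xy[1] / resolution), 0, occupancy_grid.shape[1] - 1))
    obstacle_penalty = occupancy_grid[ix, iy]

    return 10.0 * progress - 5.0 * obstacle_penalty

# Example: a 100x100 grid (5 m x 5 m at 5 cm resolution) with one occupied block.
grid = np.zeros((100, 100))
grid[40:45, 40:45] = 1.0
r = shaped_reward(robot_xy=(2.0, 2.1), goal_xy=(4.5, 4.5), prev_dist=3.6,
                  occupancy_grid=grid, resolution=0.05)
print(f"shaped reward: {r:.3f}")

Without the map term, the reward would reduce to goal progress plus terminal bonuses, which matches the standard RL baseline the abstract compares against.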
