Journal: Mathematical Problems in Engineering: Theory, Methods and Applications

Reinforcement Learning-Based Autonomous Navigation and Obstacle Avoidance for USVs under Partially Observable Conditions



Abstract

Unmanned surface vehicles (USVs) are widely used in research and exploration, patrol, and defense. Autonomous navigation and obstacle avoidance are essential USV technologies and key conditions for successful mission execution. However, conventional algorithms that rely on fine-grained modeling cannot deliver the real-time, precise behavior control that USVs need in complex environments, which poses a great challenge for autonomous control policies. This paper proposes UANOA (USV autonomous navigation and obstacle avoidance), a method based on deep reinforcement learning. UANOA accomplishes the autonomous navigation task by sensing, in real time, partial information about the complex ocean environment around the USV and by outputting rudder angle control commands in real time. We employ a double Q-network for end-to-end control, from raw sensor input to discrete rudder actions, and design a set of reward functions suited to USV navigation and obstacle avoidance. To alleviate the decision bias caused by the USV's partial observability, we use long short-term memory (LSTM) networks to strengthen the USV's memory of the ocean environment. Experiments demonstrate that UANOA brings a USV to its target points along an optimal planned path in complex ocean environments without any collisions, and that UANOA outperforms a deep Q-network (DQN) and a random control policy in convergence speed, sailing distance, rudder angle steering consumption, and other performance measures.
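The abstract names a double Q-network but does not state its update rule. As a minimal sketch, assuming the standard double Q-learning target (the online network selects the next action, the target network evaluates it), the per-transition target could look like the following. The function name, the three-action rudder set, and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def double_q_target(
    reward: float,
    done: bool,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Standard double Q-learning target for a single transition.

    q_online_next / q_target_next hold one Q-value per discrete action
    (here: hypothetical rudder commands) for the next state.
    """
    if done:
        # Terminal transition (e.g. collision or goal reached): no bootstrap.
        return reward
    # The online network picks the greedy next action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target network supplies the value of that action.
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for three rudder actions: port, hold, starboard.
q_online = [0.2, 0.8, 0.5]
q_target = [1.0, 0.5, 2.0]
target = double_q_target(1.0, False, 0.9, q_online, q_target)
```

With these illustrative numbers the online network prefers the second action, so the target bootstraps from the target network's value for that action (0.5) rather than from the target network's own maximum (2.0). Decoupling selection from evaluation in this way is what reduces the overestimation that a plain DQN target is prone to.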
