Journal: Neurocomputing

A composite learning method for multi-ship collision avoidance based on reinforcement learning and inverse control


Abstract

Model-free reinforcement learning methods have potential for ship collision avoidance in unknown environments. To address the low efficiency of model-free reinforcement learning, a composite learning method is proposed based on an asynchronous advantage actor-critic (A3C) algorithm, a long short-term memory (LSTM) neural network, and Q-learning. The proposed method uses Q-learning to decide adaptively between an LSTM inverse model-based controller and the model-free A3C policy. Multi-ship collision avoidance simulations are conducted to verify the effectiveness of the model-free A3C method, the proposed inverse model-based method, and the composite learning method. The simulation results indicate that the proposed composite learning-based ship collision avoidance method outperforms the A3C learning method and a traditional optimization-based method. (c) 2020 Elsevier B.V. All rights reserved.
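
The switching idea in the abstract, where Q-learning arbitrates between two controllers, can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the abstract does not specify the state representation, reward, or hyperparameters, so the discretized state labels, the `ControllerSelector` class, and the values of `alpha`, `gamma`, and `epsilon` below are all hypothetical choices made for illustration. The two controllers themselves (the LSTM inverse model and the A3C policy) are not modeled; only the selection layer is shown.

```python
import random
from collections import defaultdict

# The two options the selector chooses between (labels are illustrative).
ACTIONS = ("inverse_model", "a3c")


class ControllerSelector:
    """Tabular Q-learning that picks which controller to use in each state.

    Assumption: the state is any hashable, already-discretized description
    of the encounter situation. The paper's actual state is not given here.
    """

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
        self.rng = random.Random(seed)

    def select(self, state):
        """Epsilon-greedy choice of controller for the given state."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        values = self.q[state]
        return max(ACTIONS, key=values.get)

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(self.q[next_state].values())
        target = reward + self.gamma * best_next
        self.q[state][action] += self.alpha * (target - self.q[state][action])


if __name__ == "__main__":
    selector = ControllerSelector()
    choice = selector.select("far")  # "far" is a made-up state label
    selector.update("far", choice, reward=1.0, next_state="near")
    print(choice, selector.q["far"])
```

In a full system, the chosen label would dispatch to the corresponding controller, and the reward would come from the collision avoidance simulation. Both are outside the scope of this sketch.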
