Deep reinforcement learning based multi-AUVs cooperative decision-making for attack-defense confrontation missions

Abstract

This paper uses deep reinforcement learning (RL) to address the cooperative decision-making problem of multiple Autonomous Underwater Vehicles (multi-AUVs) operating under limited perception and limited communication in attack-defense confrontation missions. First, a novel Coding-Convolutional Network (CCN) is proposed, which encodes the raw sensor information of the AUVs and extracts representative features from limited perception. Second, using approximators of the state value function and the policy function built from the CCN and a fully connected network, we put forward a centralized decision-making architecture for the multi-AUV system based on the actor-critic framework, and give the corresponding algorithm combined with asynchronous training. Moreover, for multi-AUV attack-defense confrontation tasks, we develop a simulation platform to train and evaluate the proposed algorithms under different parameters. The visualization of the simulation shows directly that the algorithm enables the multi-AUV system to complete the attack-defense confrontation missions successfully and to produce some interesting collaborative behaviors. The average winning probabilities obtained in simulation also indicate that the designed method is feasible for cooperative decision-making by multi-AUVs in attack-defense confrontation missions.
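The abstract describes shared features feeding both a policy function (actor) and a state value function (critic). The sketch below illustrates only that general pattern and is not the paper's CCN or its training algorithm: the 1-D convolution with ReLU is a hypothetical stand-in for the encoder, the heads are single linear layers rather than a fully connected network, and all names, weights, and sensor values are invented for illustration. The one-step TD error is one common advantage estimate in actor-critic methods; the abstract does not say which estimate the authors use.

```python
import math

def conv1d_relu(signal, kernel):
    """Valid 1-D cross-correlation followed by ReLU (stand-in encoder, not the CCN)."""
    k = len(kernel)
    return [
        max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
        for i in range(len(signal) - k + 1)
    ]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(features, weights, bias=0.0):
    """Dot product of features and weights plus a bias."""
    return sum(f * w for f, w in zip(features, weights)) + bias

def actor_critic_forward(sensor, kernel, policy_weights, value_weights):
    """Shared features feed a policy head (actor) and a value head (critic)."""
    features = conv1d_relu(sensor, kernel)
    logits = [linear(features, w) for w in policy_weights]
    return softmax(logits), linear(features, value_weights)

def td_advantage(reward, value, next_value, gamma=0.99):
    """One-step TD error used as an advantage estimate (an assumption here)."""
    return reward + gamma * next_value - value

# Hypothetical inputs: five sensor readings, a fixed filter, three actions.
sensor = [0.0, 1.0, 0.5, 0.0, 0.2]
kernel = [0.5, -0.5, 1.0]
policy_weights = [[0.1, 0.2, -0.1], [0.0, -0.3, 0.4], [0.2, 0.0, 0.1]]
value_weights = [0.3, -0.2, 0.5]

probs, value = actor_critic_forward(sensor, kernel, policy_weights, value_weights)
adv = td_advantage(reward=1.0, value=value, next_value=0.0)
print(probs, value, adv)
```

In an actor-critic update, the advantage would scale the gradient of the log-probability of the chosen action, while the critic is regressed toward the TD target. With asynchronous training, several workers would compute such updates in parallel against shared parameters.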

Bibliographic record

  • Source
    Ocean Engineering | 2021, Issue 1 | pp. 109794.1-109794.11 | 11 pages
  • Author affiliations

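    Harbin Engineering University, College of Intelligent Systems Science and Engineering, Harbin 150001, People's Republic of China;

    Harbin Engineering University, College of Intelligent Systems Science and Engineering, Harbin 150001, People's Republic of China;

    Harbin Engineering University, College of Intelligent Systems Science and Engineering, Harbin 150001, People's Republic of China;

    Harbin Engineering University, College of Intelligent Systems Science and Engineering, Harbin 150001, People's Republic of China;

    Harbin Engineering University, College of Intelligent Systems Science and Engineering, Harbin 150001, People's Republic of China;

    Harbin Engineering University, College of Intelligent Systems Science and Engineering, Harbin 150001, People's Republic of China;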

  • Original format: PDF
  • Language of text: English
  • Keywords

    Multi-AUVs; Deep reinforcement learning; Cooperative decision-making; Attack-defense confrontation;
