Published in: Machine Learning

Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven communication

Abstract

Deep reinforcement learning algorithms have recently been used to train multiple interacting agents in a centralised manner whilst keeping their execution decentralised. When the agents can only acquire partial observations and are faced with tasks requiring coordination and synchronisation skills, inter-agent communication plays an essential role. In this work, we propose a framework for multi-agent training using deep deterministic policy gradients that enables concurrent, end-to-end learning of an explicit communication protocol through a memory device. During training, the agents learn to perform read and write operations enabling them to infer a shared representation of the world. We empirically demonstrate that concurrent learning of the communication device and individual policies can improve inter-agent coordination and performance in small-scale systems. Our experimental results show that the proposed method achieves superior performance in scenarios with up to six agents. We illustrate how different communication patterns can emerge on six different tasks of increasing complexity. Furthermore, we study the effects of corrupting the communication channel, provide a visualisation of the time-varying memory content as the underlying task is being solved and validate the building blocks of the proposed memory device through ablation studies.
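
The abstract does not spell out the read and write operations, so the following is only a minimal sketch of the general idea: agents take turns reading a shared memory vector to form a context for their action, then write a gated update back to it. Everything here is an assumption for illustration, including the class name `SharedMemoryAgent`, the helper `step_all`, the gating form, the dimensions, and the random untrained weights. It is not the authors' architecture and involves no training.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class SharedMemoryAgent:
    """Hypothetical agent with random, untrained read/write/policy weights."""

    def __init__(self, obs_dim, mem_dim, act_dim, rng):
        in_dim = obs_dim + mem_dim
        self.W_read = rng.normal(scale=0.1, size=(mem_dim, in_dim))
        self.W_gate = rng.normal(scale=0.1, size=(mem_dim, in_dim))
        self.W_cand = rng.normal(scale=0.1, size=(mem_dim, in_dim))
        self.W_pi = rng.normal(scale=0.1, size=(act_dim, in_dim))

    def read(self, obs, memory):
        # Read: combine the private observation with the shared memory
        # into a context vector.
        return np.tanh(self.W_read @ np.concatenate([obs, memory]))

    def write(self, obs, memory):
        # Write: gated interpolation between a candidate content vector
        # and the current memory (an assumed, LSTM-like update).
        x = np.concatenate([obs, memory])
        gate = sigmoid(self.W_gate @ x)
        candidate = np.tanh(self.W_cand @ x)
        return gate * candidate + (1.0 - gate) * memory

    def act(self, obs, memory):
        # Deterministic action from the observation and the read context,
        # followed by a write to the shared memory.
        context = self.read(obs, memory)
        action = np.tanh(self.W_pi @ np.concatenate([obs, context]))
        return action, self.write(obs, memory)


def step_all(agents, observations, memory):
    """Let each agent access the shared memory in turn for one time step."""
    actions = []
    for agent, obs in zip(agents, observations):
        action, memory = agent.act(obs, memory)
        actions.append(action)
    return actions, memory


rng = np.random.default_rng(0)
agents = [SharedMemoryAgent(obs_dim=4, mem_dim=8, act_dim=2, rng=rng) for _ in range(2)]
observations = [rng.normal(size=4) for _ in agents]
actions, memory = step_all(agents, observations, np.zeros(8))
```

In this sketch the second agent's action depends on what the first agent wrote, which is the sense in which the memory acts as a communication channel. In the framework the abstract describes, the corresponding operations are learned end-to-end together with the policies rather than fixed at random.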
