Journal: IEEE Journal on Selected Areas in Communications

Effective Communications: A Joint Learning and Communication Framework for Multi-Agent Reinforcement Learning Over Noisy Channels



Abstract

We propose a novel formulation of the "effectiveness problem" in communications, put forth by Shannon and Weaver in their seminal work "The Mathematical Theory of Communication", by considering multiple agents communicating over a noisy channel in order to achieve better coordination and cooperation in a multi-agent reinforcement learning (MARL) framework. Specifically, we consider a multi-agent partially observable Markov decision process (MA-POMDP), in which the agents, in addition to interacting with the environment, can also communicate with each other over a noisy communication channel. The noisy communication channel is considered explicitly as part of the dynamics of the environment, and the message each agent sends is part of the action that the agent can take. As a result, the agents learn not only to collaborate with each other but also to communicate "effectively" over a noisy channel. This framework generalizes both the traditional communication problem, where the main goal is to convey a message reliably over a noisy channel, and the "learning to communicate" framework that has received recent attention in the MARL literature, where the underlying communication channels are assumed to be error-free. We show via examples that the joint policy learned using the proposed framework is superior to that where the communication is considered separately from the underlying MA-POMDP. This is a very powerful framework, which has many real-world applications, from autonomous vehicle planning to drone swarm control, and opens up the rich toolbox of deep reinforcement learning for the design of multi-user communication systems.
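To make the central modeling idea concrete, the following is a minimal sketch of an environment step in which each agent's action includes a message and the channel noise is applied inside the transition. It is not the authors' implementation: the binary symmetric channel, the two-agent setup, the class name NoisyChannelEnv, and the toy reward (1 when both agents pick the same environment action) are all assumptions chosen for illustration.

```python
import random


def binary_symmetric_channel(bits, flip_prob, rng):
    """Flip each bit independently with probability flip_prob (assumed channel model)."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


class NoisyChannelEnv:
    """Hypothetical two-agent environment where the channel is part of the dynamics."""

    def __init__(self, flip_prob, seed=0):
        self.flip_prob = flip_prob
        self.rng = random.Random(seed)

    def step(self, actions):
        # Each agent's action is a pair: (environment action, message bits).
        env_actions = [env_action for env_action, _ in actions]
        messages = [message for _, message in actions]

        # The channel corrupts messages as part of the state transition:
        # agent i observes the noisy version of the other agent's message.
        observations = [
            binary_symmetric_channel(messages[1 - i], self.flip_prob, self.rng)
            for i in range(2)
        ]

        # Toy cooperative reward (an assumption): agents are rewarded for agreeing.
        reward = 1.0 if env_actions[0] == env_actions[1] else 0.0
        return observations, reward


if __name__ == "__main__":
    env = NoisyChannelEnv(flip_prob=0.1, seed=42)
    obs, reward = env.step([(1, [0, 1, 1]), (1, [1, 0, 0])])
    print(obs, reward)
```

Because the noise is applied inside step, a learning algorithm trained against this environment sees the channel's effect on the reward directly. It can therefore adapt what it sends to the channel, instead of relying on a separately designed error-correcting code.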
