Conference: IEEE International Conference on Parallel and Distributed Systems
Multi-agent Fault-tolerant Reinforcement Learning with Noisy Environments

Abstract

Multi-agent reinforcement learning systems address the problem of agents learning policies to achieve specific goals through interaction with an environment. Almost all existing multi-agent reinforcement learning methods assume that the agents' observations are accurate during training. They do not account for the possibility that observations may be wrong, owing to the complexity of the real environment or the presence of dishonest agents, which can make it difficult for agent training to succeed. In this paper, considering the limitations of the traditional multi-agent algorithm framework in noisy environments, we propose a multi-agent fault-tolerant reinforcement learning (MAFTRL) algorithm. Our main idea is to equip each agent with its own error detection mechanism and to design an information communication medium between agents. The error detection mechanism is based on an autoencoder, which calculates the credibility of each agent's observation and effectively reduces environmental noise. The communication medium, based on the attention mechanism, can significantly improve the agents' ability to extract effective information. Experimental results show that our approach accurately detects the agents' erroneous observations and achieves good performance and strong robustness in both the traditional reliable environment and the noisy environment. Moreover, MAFTRL significantly outperforms traditional methods in the noisy environment.
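
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the MAFTRL implementation, whose details the abstract does not give. It assumes that credibility is derived from autoencoder reconstruction error and that credibility biases scaled dot-product attention over the agents' messages. The names `credibility`, `attend`, and `toy_reconstruct`, the `temperature` parameter, and the exponential mapping are all hypothetical, and `toy_reconstruct` is a stand-in for a trained autoencoder.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def toy_reconstruct(obs: Vector) -> List[float]:
    """Hypothetical stand-in for a trained autoencoder.

    Encodes an observation as its mean and decodes by repeating that mean,
    so observations with equal coordinates are reconstructed exactly.
    """
    mean = sum(obs) / len(obs)
    return [mean] * len(obs)


def credibility(obs: Vector,
                reconstruct: Callable[[Vector], Sequence[float]],
                temperature: float = 1.0) -> float:
    """Map mean squared reconstruction error to a score in (0, 1].

    Assumption: a poorly reconstructed observation is less credible.
    """
    recon = reconstruct(obs)
    mse = sum((o - r) ** 2 for o, r in zip(obs, recon)) / len(obs)
    return math.exp(-mse / temperature)


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector,
           keys: Sequence[Vector],
           values: Sequence[Vector],
           cred: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Scaled dot-product attention with log-credibility added to each logit.

    Assumption: low-credibility agents should contribute less to the
    aggregated message. Returns (attention weights, aggregated message).
    """
    scale = math.sqrt(len(query))
    logits = [
        sum(q * k for q, k in zip(query, key)) / scale + math.log(c + 1e-12)
        for key, c in zip(keys, cred)
    ]
    weights = softmax(logits)
    dim = len(values[0])
    message = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, message


if __name__ == "__main__":
    clean = [1.0, 1.0, 1.0]   # reconstructed exactly by the toy autoencoder
    noisy = [1.0, 4.0, 1.0]   # one corrupted coordinate
    cred = [credibility(o, toy_reconstruct) for o in (clean, noisy)]
    weights, message = attend([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]], [clean, noisy], cred)
    print("credibility:", cred)
    print("attention weights:", weights)
    print("aggregated message:", message)
```

In this toy setting both agents present identical keys, so any difference in attention weight comes only from the credibility term. A learned autoencoder and learned query, key, and value projections would take the place of the hand-written stand-ins.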