International Conference on Machine Learning

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning



Abstract

Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.
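
To make the two proposed ideas concrete, the sketch below shows, under stated assumptions, how a fingerprint of the data's age could be fed to each agent's Q-network and how a multi-agent importance weight could decay obsolete replay data. This is an illustrative PyTorch sketch, not the paper's implementation; the names QNet, Transition, action_prob, and the fingerprint layout (normalised training iteration plus exploration rate epsilon) are hypothetical.

    import torch
    import torch.nn as nn
    from collections import namedtuple

    # Each replayed transition records, besides the usual fields, (a) a
    # fingerprint of the data's age (training iteration and exploration rate
    # epsilon) and (b) the probabilities the other agents assigned to their
    # recorded actions at collection time.
    Transition = namedtuple(
        "Transition",
        ["obs", "action", "reward", "next_obs", "fingerprint", "others"],
    )


    class QNet(nn.Module):
        """Per-agent Q-network conditioned on the fingerprint, so the value
        function can disambiguate how old a sampled experience is."""

        def __init__(self, obs_dim, fingerprint_dim, n_actions, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + fingerprint_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, n_actions),
            )

        def forward(self, obs, fingerprint):
            return self.net(torch.cat([obs, fingerprint], dim=-1))


    def importance_weight(t, other_agents, truncate=10.0):
        """Multi-agent importance sampling: re-weight an old transition by the
        ratio of the other agents' current probability of their recorded
        actions to the probability at collection time, so obsolete data decays
        naturally. Truncating the ratio to bound variance is an assumption."""
        ratio = 1.0
        for agent, (obs, act, old_p) in zip(other_agents, t.others):
            new_p = agent.action_prob(obs, act)  # hypothetical per-agent helper
            ratio *= new_p / max(old_p, 1e-8)
        return min(ratio, truncate)


    # Usage sketch: build the same fingerprint when acting and when replaying,
    # and scale each sampled transition's TD loss by its importance weight, e.g.
    #   fingerprint = torch.tensor([iteration / max_iterations, epsilon])
    #   q_values = qnet(obs, fingerprint)
    #   loss = importance_weight(t, other_agents) * td_error(t) ** 2

The design point carried by both pieces is the same: the replay memory holds data generated under other agents' old policies, so the learner must either down-weight that data (the importance ratio) or be told how old it is (the fingerprint input).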
