International Conference on Autonomous Agents and Multiagent Systems

Learning in Multi-agent Systems with Sparse Interactions by Knowledge Transfer and Game Abstraction

Abstract

In many multi-agent systems, the interactions between agents are sparse, and exploiting this sparseness in multi-agent reinforcement learning (MARL) can improve learning performance. Moreover, agents may have already learned some single-agent knowledge (e.g., a local value function) before the multi-agent learning process begins. In this work, we investigate how such knowledge can be utilized to learn better policies in multi-agent systems with sparse interactions. We adopt game theory-based MARL as the basic learning approach, since it coordinates agents better. We contribute three knowledge transfer mechanisms. The first is value function transfer, which directly transfers agents' local value functions to the learning algorithm. The second is selective value function transfer, which transfers the value functions only in states where the environmental dynamics change slightly. The last is model transfer-based game abstraction, which further improves the former two mechanisms by abstracting the one-shot game in each state and reducing equilibrium computation. Experimental results on benchmarks show that with the three knowledge transfer mechanisms, all of the tested game theory-based MARL algorithms are drastically improved and achieve better asymptotic performance than the state-of-the-art algorithm CQ-learning.
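
The abstract describes the first two mechanisms only at a high level; the sketch below illustrates them in toy form, assuming tabular Q-values. All names here (seed_joint_q, selective_seed, dynamics_shift, the threshold tau) are hypothetical illustrations, not the paper's actual algorithms or API.

```python
# A toy sketch of value function transfer and its selective variant,
# assuming tabular Q-values. All names are illustrative assumptions.
import itertools
from collections import defaultdict

def seed_joint_q(local_qs, states, joint_actions):
    """Value function transfer: initialize agent i's joint-action
    Q-table so that Q_i(s, (a_1, ..., a_n)) starts at the pre-learned
    single-agent value Q_i_local(s, a_i)."""
    joint_q = [defaultdict(float) for _ in local_qs]
    for s in states:
        for ja in joint_actions:
            for i, lq in enumerate(local_qs):
                joint_q[i][(s, ja)] = lq.get((s, ja[i]), 0.0)
    return joint_q

def selective_seed(local_qs, states, joint_actions, dynamics_shift, tau=0.1):
    """Selective value function transfer: transfer only in states where
    the estimated change in environment dynamics stays below tau."""
    stable = [s for s in states if dynamics_shift(s) < tau]
    return seed_joint_q(local_qs, stable, joint_actions)

# Toy usage: 2 agents, 3 states, 2 actions each; state 2 is "near" the
# other agent, so its dynamics differ from the single-agent setting.
states = list(range(3))
joint_actions = list(itertools.product([0, 1], repeat=2))
local_qs = [
    {(s, a): float(a) for s in states for a in [0, 1]},   # agent 0
    {(s, a): -0.5 * a for s in states for a in [0, 1]},   # agent 1
]
joint_q = selective_seed(local_qs, states, joint_actions,
                         dynamics_shift=lambda s: 0.3 if s == 2 else 0.0)
print(joint_q[0][(0, (1, 0))])  # 1.0: transferred from agent 0's local Q
print(joint_q[0][(2, (1, 0))])  # 0.0: state 2 skipped by selective transfer
```

Seeding from local values in states whose dynamics barely change gives the joint learner a warm start while leaving genuinely interactive states to be learned from scratch; the third mechanism, model transfer-based game abstraction, would additionally shrink the one-shot game solved per state, but a faithful version depends on the paper's model transfer details and is omitted here.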
