Published in: ACM Transactions on Autonomous and Adaptive Systems

Multiagent Reinforcement Social Learning toward Coordination in Cooperative Multiagent Systems

Abstract

Most previous work on coordination in cooperative multiagent systems studies how two (or more) players can coordinate on Pareto-optimal Nash equilibria through fixed and repeated interactions in cooperative games. In practical complex environments, however, the interactions between agents can be sparse, and each agent's interacting partners may change frequently and randomly. To this end, we investigate multiagent coordination problems in cooperative environments under a social learning framework. We consider a large population of agents in which, in each round, each agent interacts with another agent chosen at random from the population. Each agent learns its policy through repeated interactions with the rest of the agents via social learning. It is not clear a priori whether all agents can learn a consistent optimal coordination policy in such a situation. We distinguish two types of learners according to the amount of information each agent can perceive: individual action learners and joint action learners. The learning performance of both types of learners is evaluated on a number of challenging deterministic and stochastic cooperative games. The influence of the degree of information sharing on learning performance is also investigated, which is a key difference from learning frameworks involving repeated interactions among fixed agents.
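
The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' algorithm: the payoff matrix, the stateless epsilon-greedy Q-learning rule, the pairing by shuffling, and all names (`PAYOFF`, `IndividualActionLearner`, `run`) are hypothetical choices made for illustration. It covers only the individual action learner case, in which an agent sees its own action and reward, and it does not model the information sharing studied in the paper.

```python
import random

# Hypothetical two-action cooperative game: both agents get the same payoff.
# (0, 0) is the Pareto-optimal equilibrium; (1, 1) is a suboptimal one.
PAYOFF = [[10.0, 0.0],
          [0.0, 5.0]]


class IndividualActionLearner:
    """Stateless Q-learner that observes only its own action and reward."""

    def __init__(self, n_actions, alpha=0.1, epsilon=0.1):
        self.q = [0.0] * n_actions   # one value estimate per own action
        self.alpha = alpha           # learning rate (assumed value)
        self.epsilon = epsilon       # exploration rate (assumed value)

    def act(self, rng):
        # Epsilon-greedy selection with random tie-breaking.
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.q))
        best = max(self.q)
        return rng.choice([a for a, v in enumerate(self.q) if v == best])

    def update(self, action, reward):
        # Move the estimate for the chosen action toward the observed reward.
        self.q[action] += self.alpha * (reward - self.q[action])

    def greedy(self):
        return max(range(len(self.q)), key=lambda a: self.q[a])


def run(n_agents=100, rounds=2000, seed=0):
    """Each round, shuffle the population and pair neighbours at random."""
    rng = random.Random(seed)
    agents = [IndividualActionLearner(len(PAYOFF)) for _ in range(n_agents)]
    for _ in range(rounds):
        order = list(range(n_agents))
        rng.shuffle(order)
        # With an odd population size, the last agent sits the round out.
        for i in range(0, n_agents - 1, 2):
            first, second = agents[order[i]], agents[order[i + 1]]
            x, y = first.act(rng), second.act(rng)
            reward = PAYOFF[x][y]    # shared payoff in a cooperative game
            first.update(x, reward)
            second.update(y, reward)
    return agents


if __name__ == "__main__":
    population = run()
    share = sum(a.greedy() == 0 for a in population) / len(population)
    print(f"share of agents whose greedy action is 0: {share:.2f}")
```

A joint action learner could be sketched in the same way by indexing the value table by the pair (own action, partner action) instead of by the agent's own action alone; that variant assumes the agent can observe its partner's choice.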
