Collaborative multi-agent reinforcement learning based on a novel coordination tree frame with dynamic partition

Abstract

In research on team Markov games, the two main problems are computing the coordination team dynamically and determining the joint action policy. To address the first problem, a dynamic team partitioning method is proposed based on a novel coordination tree frame. We build a coordination tree from coordinated agent subsets and define two breaching weights that represent the weight of an agent cooperating with an agent subset. Based on the coordination tree, each agent chooses the agent subset with the minimum cost as its coordination team. To address the second problem, Q-learning based on belief allocation is used to learn the multi-agent joint action policy, which helps the joint action policy of the cooperative agents converge to the optimum solution. We perform experiments in multiple simulation environments and compare the proposed algorithm with similar ones. Experimental results show that the proposed algorithms are able to dynamically compute the cooperative teams and design the optimum joint action policy for them.
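
The two steps named in the abstract, picking the minimum-cost agent subset as a team and then learning a joint action policy with Q-learning, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the agent names, the cost values, the function names `choose_team` and `q_update`, and the learning parameters are hypothetical. The abstract does not say how costs are derived from the coordination tree and its weights, so they are simply given as numbers, and the update shown is the standard tabular Q-learning rule over joint actions, not the belief-allocation variant the paper proposes.

```python
from collections import defaultdict
from itertools import product


def choose_team(costs):
    """Return the candidate agent subset with the lowest cost.

    `costs` maps frozenset-of-agent-names -> cost. Ties are broken by the
    sorted member list so the result is deterministic.
    """
    return min(costs, key=lambda team: (costs[team], sorted(team)))


def q_update(q, state, joint_action, reward, next_state, joint_actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning step over joint actions.

    This is the textbook rule, NOT the paper's belief-allocation variant:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in joint_actions)
    td_target = reward + gamma * best_next
    q[(state, joint_action)] += alpha * (td_target - q[(state, joint_action)])
    return q[(state, joint_action)]


# Hypothetical costs for the candidate subsets that agent "a1" could join.
costs = {
    frozenset({"a1", "a2"}): 0.7,
    frozenset({"a1", "a3"}): 0.4,
    frozenset({"a1", "a2", "a3"}): 0.9,
}
team = choose_team(costs)

# Joint action space of the chosen team: one individual action per member.
joint_actions = list(product(["left", "right"], repeat=len(team)))

# Q-table over (state, joint_action) pairs, initialised to zero.
q = defaultdict(float)
new_value = q_update(q, "s0", ("left", "right"), 1.0, "s1", joint_actions)
```

Because the team is re-selected dynamically, the joint action space in this sketch is rebuilt from the current team's size, so the Q-table only ever spans the actions of the agents that are actually coordinating.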