Network Distributed POMDP with Communication


Abstract

While Distributed POMDPs have become popular for modeling multiagent systems in uncertain domains, it is the Network Distributed POMDP (ND-POMDP) model that has begun to scale up the number of agents. ND-POMDPs can exploit the locality of agents' interactions. However, prior work on ND-POMDPs has failed to address communication. Without communication, the size of each agent's local policy in an ND-POMDP grows exponentially in the time horizon. To overcome this problem, we extend existing algorithms so that agents periodically communicate their observation and action histories to each other. After communication, agents can start from a new synchronized belief state, avoiding the exponential growth in the size of their local policies. Furthermore, we introduce an idea similar to the Point-based Value Iteration algorithm that approximates the value function with a fixed number of representative points. Our experimental results show that we can obtain much longer policies than existing algorithms as long as the interval between communications is small.
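The core idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration with hypothetical names and a toy two-state model, not the authors' algorithm: agents track local observation-action histories, and every `interval` steps they exchange those histories and compute a shared synchronized belief by Bayesian filtering, after which local histories are cleared. Local policy trees therefore only ever need depth `interval`, not the full horizon.

```python
def belief_update(belief, trans, obs_model, joint_action, joint_obs):
    """One step of Bayesian filtering over hidden states.

    belief:    {state: probability}
    trans:     {(s, joint_action, s'): P(s' | s, a)}
    obs_model: {(s', joint_action, o): P(o | s', a)}
    """
    new_belief = {}
    for s2 in belief:
        pred = sum(belief[s1] * trans[(s1, joint_action, s2)] for s1 in belief)
        new_belief[s2] = pred * obs_model[(s2, joint_action, joint_obs)]
    z = sum(new_belief.values()) or 1.0
    return {s: p / z for s, p in new_belief.items()}


def run_with_sync(init_belief, trans, obs_model, episode, interval):
    """Replay an episode of (joint_action, joint_obs) pairs, exchanging
    histories and resynchronizing the belief every `interval` steps."""
    belief = dict(init_belief)
    local_history = []   # what an agent must remember between syncs
    longest = 0
    for t, (a, o) in enumerate(episode, 1):
        local_history.append((a, o))
        longest = max(longest, len(local_history))
        belief = belief_update(belief, trans, obs_model, a, o)
        if t % interval == 0:
            # Histories have been exchanged: the filtered belief is now
            # common knowledge, so local memory can be reset.
            local_history.clear()
    return belief, longest


# Toy 2-state, 1-action model: states tend to persist (0.9),
# observations match the true state with probability 0.8.
init = {'s0': 0.5, 's1': 0.5}
trans = {('s0', 'a', 's0'): 0.9, ('s0', 'a', 's1'): 0.1,
         ('s1', 'a', 's1'): 0.9, ('s1', 'a', 's0'): 0.1}
obs_model = {('s0', 'a', 'o0'): 0.8, ('s0', 'a', 'o1'): 0.2,
             ('s1', 'a', 'o0'): 0.2, ('s1', 'a', 'o1'): 0.8}
episode = [('a', 'o0')] * 6
belief, longest = run_with_sync(init, trans, obs_model, episode, interval=3)
```

With a horizon of 6 and a communication interval of 3, each agent's remembered history never exceeds 3 steps, which is the source of the claimed savings over horizon-length policy trees.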
