Journal: 解放军理工大学学报(自然科学版) (Journal of PLA University of Science and Technology, Natural Science Edition)

Interactive Dynamic Influence Diagrams and Their Exact Solution Algorithm

     

Abstract

Interactive dynamic influence diagrams (I-DIDs) are introduced to represent the dynamic structural relationships among agents making decisions in a partially observable Markov setting shared with other agents. They extend influence diagrams (IDs) over both time and structure, yielding a decision model that can also model the other agents. I-DIDs are graphical models for sequential multi-agent decision making under uncertainty: solving one gives an agent its optimal policy given its belief, including a predicted probability distribution over the other agents' behavior, as it acts and observes in the setting, which addresses multi-agent decision problems more effectively. Exact algorithms for solving I-DIDs must solve every candidate model of the other agents and then update all of those models at every time step. Because the space of candidate models grows exponentially with the number of time steps, the computation quickly becomes expensive. This paper therefore presents an exact solution method based on minimal model sets under behavioral equivalence: it shrinks the space of the other agents' candidate models and updates only the selected models, which limits model growth and reduces the computational cost. Simulation experiments on model instances show that the algorithm is effective.
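The abstract does not spell out how the minimal model set is built, so the following Python fragment is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes that two candidate models are behaviorally equivalent when solving them yields the same policy, keeps one representative per equivalence class, and transfers the probability mass of the discarded models to that representative. The names `minimal_model_set`, `solve`, and the toy model ids and action labels are all hypothetical.

```python
from typing import Callable, Dict, Hashable, Tuple


def minimal_model_set(
    belief: Dict[str, float],
    solve: Callable[[str], Hashable],
) -> Dict[str, float]:
    """Keep one representative per behaviorally equivalent class.

    `belief` maps a candidate model id to its probability.
    `solve` returns the model's policy in a hashable form.
    Both are hypothetical stand-ins for a real I-DID model solver.
    """
    # policy -> (representative model id, accumulated probability mass)
    classes: Dict[Hashable, Tuple[str, float]] = {}
    for model, prob in belief.items():
        policy = solve(model)
        if policy in classes:
            # Same policy as an earlier model: merge its mass, drop the model.
            rep, mass = classes[policy]
            classes[policy] = (rep, mass + prob)
        else:
            # First model seen with this policy becomes the representative.
            classes[policy] = (model, prob)
    return {rep: mass for rep, mass in classes.values()}


# Toy data (assumed): three candidate models, two of which act identically.
toy_policies = {
    "m1": ("listen", "open-left"),
    "m2": ("listen", "open-left"),
    "m3": ("listen", "open-right"),
}
toy_belief = {"m1": 0.5, "m2": 0.25, "m3": 0.25}

reduced = minimal_model_set(toy_belief, toy_policies.__getitem__)
print(reduced)
```

In this toy case the three candidates collapse to two representatives, so only two models would need to be updated at the next time step while the total probability mass is preserved.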
