Journal: Autonomous Agents and Multi-Agent Systems
Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs


Abstract

POMDPs and their decentralized multiagent counterparts, DEC-POMDPs, offer a rich framework for sequential decision making under uncertainty. Their high computational complexity, however, presents an important research challenge. One way to address the intractable memory requirements of current algorithms is based on representing agent policies as finite-state controllers. Using this representation, we propose a new approach that formulates the problem as a nonlinear program, which defines an optimal policy of a desired size for each agent. This new formulation allows a wide range of powerful nonlinear programming algorithms to be used to solve POMDPs and DEC-POMDPs. Although solving the NLP optimally is often intractable, the results we obtain using an off-the-shelf optimization method are competitive with state-of-the-art POMDP algorithms and outperform state-of-the-art DEC-POMDP algorithms. Our approach is easy to implement and it opens up promising research directions for solving POMDPs and DEC-POMDPs using nonlinear programming methods.
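To make the finite-state controller representation concrete, here is a minimal single-agent sketch in Python. It is not the authors' implementation, and the abstract does not spell out the nonlinear program itself. The sketch only computes the value of one fixed stochastic controller, which is the quantity a nonlinear programming solver would try to maximize while treating the controller probabilities as variables. The function names (`evaluate_controller`, `objective`), the array layouts, and the toy problem are all assumptions made for illustration.

```python
def evaluate_controller(T, O, R, gamma, psi, eta, tol=1e-10, max_iter=100000):
    """Iteratively compute V[q][s] for a fixed stochastic controller.

    Hypothetical array layout (an assumption of this sketch):
      T[s][a][s2]      : P(s2 | s, a), state transition
      O[a][s2][o]      : P(o | s2, a), observation
      R[s][a]          : immediate reward
      psi[q][a]        : P(a | q), action selection at controller node q
      eta[q][a][o][q2] : P(q2 | q, a, o), controller node transition
    """
    n_q, n_s = len(psi), len(T)
    n_a, n_o = len(R[0]), len(O[0][0])
    V = [[0.0] * n_s for _ in range(n_q)]
    for _ in range(max_iter):
        new_V = [[0.0] * n_s for _ in range(n_q)]
        for q in range(n_q):
            for s in range(n_s):
                total = 0.0
                for a in range(n_a):
                    # Expected discounted value of the next (node, state) pair.
                    future = 0.0
                    for s2 in range(n_s):
                        for o in range(n_o):
                            for q2 in range(n_q):
                                future += (T[s][a][s2] * O[a][s2][o]
                                           * eta[q][a][o][q2] * V[q2][s2])
                    total += psi[q][a] * (R[s][a] + gamma * future)
                new_V[q][s] = total
        delta = max(abs(new_V[q][s] - V[q][s])
                    for q in range(n_q) for s in range(n_s))
        V = new_V
        if delta < tol:
            break
    return V


def objective(V, b0, q0=0):
    """Expected value of starting in node q0 under initial belief b0."""
    return sum(b0[s] * V[q0][s] for s in range(len(b0)))


if __name__ == "__main__":
    # Toy problem with 2 states, 2 actions, 2 observations (made up).
    T = [[[0.9, 0.1], [0.5, 0.5]],
         [[0.1, 0.9], [0.5, 0.5]]]
    O = [[[0.8, 0.2], [0.2, 0.8]],
         [[0.5, 0.5], [0.5, 0.5]]]
    R = [[1.0, 0.0],
         [0.0, 1.0]]
    # A 2-node controller with uniform action and transition probabilities.
    psi = [[0.5, 0.5], [0.5, 0.5]]
    eta = [[[[0.5, 0.5] for _ in range(2)] for _ in range(2)] for _ in range(2)]
    V = evaluate_controller(T, O, R, 0.95, psi, eta)
    print(objective(V, [0.5, 0.5]))
```

In a nonlinear programming treatment of this setup, `psi`, `eta`, and `V` would all become decision variables, the update inside the loop would become an equality constraint, and the probability entries would be constrained to be non-negative and sum to one. The controller size (the number of nodes) stays fixed throughout, which is what keeps memory use bounded.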
