European Conference on Machine Learning and Knowledge Discovery in Databases

Error-Bounded Approximations for Infinite-Horizon Discounted Decentralized POMDPs

Abstract

We address decentralized stochastic control problems represented as decentralized partially observable Markov decision processes (Dec-POMDPs). This formalism provides a general model for decision-making under uncertainty in cooperative, decentralized settings, but its worst-case complexity (NEXP-complete) makes it difficult to solve optimally. Recent advances suggest recasting Dec-POMDPs into continuous-state, deterministic MDPs. In this form, however, states and actions are embedded into high-dimensional spaces, making accurate estimation of states and greedy selection of actions intractable for all but trivial-sized problems. The primary contribution of this paper is the first framework for error monitoring during approximate estimation of states and selection of actions. Such a framework permits us to convert state-of-the-art exact methods into error-bounded algorithms, which results in a scalability increase, as demonstrated by experiments over problems of unprecedented sizes.
