IEEE Transactions on Automatic Control

Asymptotic Optimality of Finite Model Approximations for Partially Observed Markov Decision Processes With Discounted Cost



Abstract

We consider finite model approximations of discrete-time partially observed Markov decision processes (POMDPs) under the discounted cost criterion. After converting the original partially observed stochastic control problem to a fully observed one on the belief space, the finite models are obtained through the uniform quantization of the state and action spaces of the belief space Markov decision process (MDP). Under mild assumptions on the components of the original model, it is established that the policies obtained from these finite models are nearly optimal for the belief space MDP, and hence for the original partially observed problem. The assumptions essentially require that the belief space MDP satisfy a mild weak continuity condition. We provide an example and introduce explicit approximation procedures for the quantization of the set of probability measures on the state space of the POMDP (i.e., the belief space).
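To illustrate the kind of quantization the abstract describes, the following sketch maps a belief vector (a probability measure on a finite state space) to the nearest point of a uniform grid on the probability simplex with resolution 1/n. This is a minimal, hypothetical illustration using largest-remainder rounding, not the paper's exact construction; the function name and rounding scheme are our own assumptions.

```python
def quantize_belief(b, n):
    """Sketch (assumption, not the paper's construction): map a belief
    vector b (non-negative entries summing to 1) to the nearest point of
    the uniform grid {k/n : k non-negative integers, sum(k) = n} on the
    probability simplex, via largest-remainder rounding."""
    scaled = [p * n for p in b]
    k = [int(s) for s in scaled]          # floor of each scaled mass
    deficit = n - sum(k)                  # grid units still to distribute
    # hand the leftover units to the largest fractional remainders,
    # so the quantized vector still sums to exactly n/n = 1
    order = sorted(range(len(b)), key=lambda i: scaled[i] - k[i], reverse=True)
    for i in order[:deficit]:
        k[i] += 1
    return [ki / n for ki in k]
```

For example, with resolution n = 10, a belief already on the grid is returned unchanged, while an off-grid belief such as the uniform distribution on three states is snapped to a nearby grid point whose coordinates are multiples of 1/10 and still sum to one. As n grows, the grid becomes dense in the simplex, which is what makes the finite-model policies asymptotically optimal.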


