IEEE Transactions on Automatic Control

Observation-Based Optimization for POMDPs With Continuous State, Observation, and Action Spaces

Abstract

This paper considers the optimization problem for partially observable Markov decision processes (POMDPs) with continuous state, observation, and action spaces. POMDPs with discrete spaces have emerged as a promising approach to decision systems with imperfect state information. However, many recent applications of POMDPs involve continuous states, observations, and actions. For such problems, due to the infinite dimensionality of the belief space, existing studies usually discretize the continuous spaces using sufficient or nonsufficient statistics, which may cause the curse of dimensionality and performance degradation. In this paper, based on a sensitivity analysis of the performance criteria, we develop a simulation-based policy iteration algorithm to find a locally optimal observation-based policy for POMDPs with continuous spaces. The proposed algorithm requires no specific assumptions or prior information and has low computational complexity. A numerical example on a complicated multiple-input multiple-output beamforming problem shows that the algorithm yields a significant performance improvement.
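
To make the idea of simulation-based improvement of an observation-based policy concrete, the sketch below shows one possible form of such an update for a continuous-space POMDP. It is only a hedged illustration, not the algorithm proposed in the paper: it uses a generic score-function (likelihood-ratio) gradient estimator for a linear-Gaussian policy a = K·o + noise, and the parameterization K, the step size, and the user-supplied helpers step_fn, obs_fn, and reward_fn are all assumptions introduced here for exposition.

    import numpy as np

    # Hypothetical illustration: score-function gradient for an
    # observation-based linear-Gaussian policy in a continuous POMDP.
    # step_fn(x, a, rng) -> next state, obs_fn(x, rng) -> observation,
    # reward_fn(x, a) -> scalar reward are assumed, user-supplied models.

    def simulate_episode(step_fn, obs_fn, reward_fn, K, sigma, x0, T, rng):
        """Roll out one trajectory under the policy a = K @ o + sigma * noise."""
        x = x0
        score = np.zeros_like(K)   # accumulated d(log pi)/dK along the trajectory
        total_reward = 0.0
        for _ in range(T):
            o = obs_fn(x, rng)                                   # continuous observation
            a = K @ o + sigma * rng.standard_normal(K.shape[0])  # stochastic action
            score += np.outer(a - K @ o, o) / sigma**2           # Gaussian score function
            total_reward += reward_fn(x, a)
            x = step_fn(x, a, rng)                               # continuous state transition
        return total_reward, score

    def policy_gradient_step(K, sigma, step_fn, obs_fn, reward_fn, x0,
                             T=100, n_episodes=200, lr=1e-3, rng=None):
        """One simulation-based improvement step of the observation-based policy."""
        rng = np.random.default_rng() if rng is None else rng
        returns, scores = [], []
        for _ in range(n_episodes):
            R, s = simulate_episode(step_fn, obs_fn, reward_fn, K, sigma, x0, T, rng)
            returns.append(R)
            scores.append(s)
        baseline = np.mean(returns)                 # variance-reduction baseline
        grad = sum((R - baseline) * s for R, s in zip(returns, scores)) / n_episodes
        return K + lr * grad                        # gradient ascent on the return

Repeatedly calling policy_gradient_step plays the role of the policy-improvement loop: each call estimates the performance gradient from simulated trajectories and updates the policy parameters, which is the general flavor of simulation-based, sensitivity-driven optimization the abstract describes.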