Venue: International Conference on Automated Planning and Scheduling

Online Algorithms for POMDPs with Continuous State, Action, and Observation Spaces

Abstract

Online solvers for partially observable Markov decision processes (POMDPs) have been applied to problems with large discrete state spaces, but continuous state, action, and observation spaces remain a challenge. This paper begins by investigating double progressive widening (DPW) as a solution to this challenge. However, we prove that this modification alone is not sufficient: the belief representations in the search tree collapse to a single particle, causing the algorithm to converge to a policy that is suboptimal regardless of the computation time. This paper proposes and evaluates two new algorithms, POMCPOW and PFT-DPW, that overcome this deficiency by using weighted particle filtering. Simulation results show that these modifications allow the algorithms to succeed where previous approaches fail.
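The two ingredients named in the abstract, progressive widening and weighted particle beliefs, can be illustrated with a minimal sketch. This is not the paper's implementation of POMCPOW or PFT-DPW. The function names, the parameters `k` and `alpha`, and the one-dimensional Gaussian observation model are all assumptions chosen for illustration. The widening test uses the commonly cited form, which allows a new child while the child count is at most `k * N**alpha`.

```python
import math


def should_widen(num_children: int, num_visits: int,
                 k: float = 1.0, alpha: float = 0.5) -> bool:
    """Progressive widening test (illustrative form).

    A node may add a new child while |children| <= k * N^alpha,
    so the branching factor grows slowly with the visit count N.
    """
    return num_children <= k * num_visits ** alpha


def gaussian_likelihood(obs: float, state: float, sigma: float = 1.0) -> float:
    """Assumed observation model: obs ~ Normal(state, sigma^2)."""
    z = (obs - state) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def reweight(particles: list[float], obs: float,
             sigma: float = 1.0) -> list[float]:
    """Return normalized weights for each particle given an observation.

    Every particle stays in the belief; only its weight changes.
    Falls back to uniform weights if all likelihoods underflow to zero.
    """
    weights = [gaussian_likelihood(obs, s, sigma) for s in particles]
    total = sum(weights)
    if total == 0.0:
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]


def weighted_mean(particles: list[float], weights: list[float]) -> float:
    """Belief mean under the weighted particle representation."""
    return sum(p * w for p, w in zip(particles, weights))
```

Because the particles are weighted by observation likelihood rather than matched to exact observations, a belief node can hold many particles even when observations are continuous, which is the intuition behind avoiding the single-particle collapse described above.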
