
Using core beliefs for point-based value iteration


Abstract

Recent research on point-based approximation algorithms for POMDPs has demonstrated that good solutions to POMDP problems can be obtained without considering the entire belief simplex. For instance, the Point-Based Value Iteration (PBVI) algorithm [Pineau et al., 2003] computes the value function only for a small set of belief states and iteratively adds more points to the set as needed. A key component of the algorithm is the strategy for selecting belief points, such that the space of reachable beliefs is well covered. This paper presents a new method for selecting an initial set of representative belief points, which relies on first finding a basis for the reachable belief simplex. Our approach has better worst-case performance than the original PBVI heuristic, and performs well in several standard POMDP tasks.
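The idea of seeding the point set with a basis of the reachable beliefs can be illustrated with a short sketch. The abstract does not give the paper's procedure, so this is only one plausible reading and not the authors' algorithm: beliefs reachable from an initial belief are expanded breadth-first, and a belief is kept only if it is linearly independent of those already kept. The function names (`belief_update`, `core_beliefs`), the nested-list model layout `T[a][s][s']` and `O[a][s'][o]`, and the two-state tiger model used as input are all assumptions made for illustration.

```python
from collections import deque
from itertools import product


def belief_update(b, a, o, T, O):
    """Bayes filter: b'(s') is proportional to O[a][s'][o] * sum_s T[a][s][s'] * b[s]."""
    n = len(b)
    unnorm = [
        O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(unnorm)
    # z == 0 means observation o cannot occur after action a from belief b.
    return None if z == 0 else [x / z for x in unnorm]


def core_beliefs(b0, T, O, tol=1e-9):
    """Collect linearly independent beliefs reachable from b0 (hypothetical helper).

    Only beliefs that enlarge the span are expanded further. Because the
    unnormalized update is linear in b, successors of a dependent belief stay
    in the span of successors of the kept beliefs, so the search ends after
    at most len(b0) beliefs have been kept.
    """
    n_actions, n_obs = len(T), len(O[0][0])
    echelon = []          # (pivot index, reduced row) pairs, in insertion order
    core = []             # the beliefs kept as basis elements
    queue = deque([list(b0)])
    while queue:
        b = queue.popleft()
        r = list(b)
        # Incremental Gaussian elimination against the rows kept so far.
        for pivot, row in echelon:
            f = r[pivot] / row[pivot]
            r = [x - f * y for x, y in zip(r, row)]
        pivot = max(range(len(r)), key=lambda i: abs(r[i]))
        if abs(r[pivot]) <= tol:
            continue      # b lies in the span of the beliefs already kept
        echelon.append((pivot, r))
        core.append(b)
        for a, o in product(range(n_actions), range(n_obs)):
            b2 = belief_update(b, a, o, T, O)
            if b2 is not None:
                queue.append(b2)
    return core


# Illustrative tiger-style model (assumed, not taken from the paper).
# States: tiger-left, tiger-right. Actions: listen, open-left, open-right.
identity = [[1.0, 0.0], [0.0, 1.0]]
uniform = [[0.5, 0.5], [0.5, 0.5]]
T = [identity, uniform, uniform]
O = [[[0.85, 0.15], [0.15, 0.85]], uniform, uniform]

core = core_beliefs([0.5, 0.5], T, O)
print(core)
```

A set found this way contains at most as many beliefs as there are states, which makes it a compact starting point that a PBVI-style solver could then expand as needed.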
