A Value Iteration Algorithm for Solving POMDPs Based on Hybrid Criteria

Liu Feng (刘峰)


Abstract

Point-based value iteration methods are an effective class of algorithms for solving partially observable Markov decision process (POMDP) problems. However, most of these algorithms explore the belief point set using a single heuristic criterion, which limits their effectiveness. This paper presents a hybrid heuristic value iteration algorithm (HHVI) that explores the belief point set under hybrid criteria while maintaining both an upper and a lower bound on the value function. When the explored point set is expanded, only belief points whose gap between the upper and lower bounds exceeds a threshold are selected for expansion. Among the successor belief points whose bound gap also exceeds the threshold, the one farthest from the explored point set is chosen for exploration, so that the explored points are distributed as effectively as possible over the reachable belief space. Experiments on four benchmark problems show that HHVI maintains convergence efficiency and converges to a better global solution.
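The selection rule described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the L1 distance, the `upper` and `lower` callables, the `successors_of` function, and the threshold name `eps` are all hypothetical choices, since the abstract does not specify the distance metric or how the bounds are represented.

```python
import numpy as np


def l1_distance_to_set(b, points):
    """Smallest L1 distance from belief b to any belief in points (assumed metric)."""
    return min(np.abs(b - p).sum() for p in points)


def select_successor(successors, explored, upper, lower, eps):
    """Pick one successor belief according to the two criteria in the abstract.

    Criterion 1: keep only successors whose bound gap upper(b) - lower(b) exceeds eps.
    Criterion 2: among those, return the one farthest from the explored set.
    Returns None when no successor passes criterion 1.
    """
    candidates = [b for b in successors if upper(b) - lower(b) > eps]
    if not candidates:
        return None
    return max(candidates, key=lambda b: l1_distance_to_set(b, explored))


def expand(explored, successors_of, upper, lower, eps):
    """One expansion pass over the explored belief set.

    successors_of(b) is a hypothetical function returning the beliefs
    reachable from b in one step (one per action/observation pair).
    """
    new_points = []
    for b in explored:
        # Skip beliefs whose value bounds have already closed to within eps.
        if upper(b) - lower(b) <= eps:
            continue
        chosen = select_successor(
            successors_of(b), explored + new_points, upper, lower, eps
        )
        if chosen is not None:
            new_points.append(chosen)
    return explored + new_points
```

In a full solver, a pass like this would alternate with point-based backups that tighten the bounds at the explored beliefs; those steps are omitted here because the abstract does not describe them.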

Bibliographic details

Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》) | 2016, No. 11 | pp. 961-968 | 8 pages
Author
Liu Feng
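Author affiliations

Software Institute, Nanjing University, Nanjing 210093

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093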

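Original format: PDF
Language: Chinese
CLC classification: special-purpose application software
Keywords
partially observable Markov decision process (POMDP); hybrid heuristic value iteration; reachable belief space; exploration value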
