IEEE/WIC/ACM International Conference on Intelligent Agent Technology

Decomposing large-scale POMDP via belief state analysis

Abstract

Partially observable Markov decision process (POMDP) is commonly used to model a stochastic environment with unobservable states for supporting optimal decision making. Computing the optimal policy for a large-scale POMDP is known to be intractable. Belief compression, an approximate solution, has recently been proposed to reduce the dimension of the POMDP's belief state space and has been shown to be effective in improving problem tractability. In this paper, with the conjecture that temporally close belief states can be characterized by a lower intrinsic dimension, we propose a spatio-temporal belief clustering that considers both the belief states' spatial (in the belief space) and temporal similarities, and incorporate it into the belief compression algorithm. The proposed clustering yields belief state clusters that form sub-POMDPs of much lower dimension, which can be distributed to a set of agents for collaborative problem solving. The proposed method has been tested on a synthesized navigation problem (Hallway2) and empirically shown to produce policies with superior long-term rewards compared with those based on belief compression alone. Some future research directions for extending this belief state analysis approach are also discussed.
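
The clustering idea can be pictured as follows. Below is a minimal sketch, not the authors' implementation: belief states sampled along a trajectory are grouped using a distance that mixes their spatial separation in the belief simplex with their temporal separation along the trajectory, and each resulting cluster would correspond to a lower-dimensional sub-POMDP. The L1 spatial metric, the trade-off weight `lam`, the average-linkage agglomerative clustering, and the function name `spatio_temporal_cluster` are illustrative assumptions, not details taken from the paper.

```python
# Sketch of spatio-temporal belief-state clustering (assumptions noted above).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def spatio_temporal_cluster(beliefs, times, n_clusters, lam=0.5):
    """Cluster belief states by a weighted spatial + temporal distance.

    beliefs : (N, |S|) array; each row is a probability distribution over states
    times   : (N,) array of time indices at which each belief was visited
    lam     : trade-off between spatial (belief-space) and temporal distance
    """
    beliefs = np.asarray(beliefs, dtype=float)
    times = np.asarray(times, dtype=float)

    # Spatial term: L1 distance between belief vectors, scaled to [0, 1].
    spatial = np.abs(beliefs[:, None, :] - beliefs[None, :, :]).sum(-1) / 2.0
    # Temporal term: absolute difference of visit times, scaled to [0, 1].
    temporal = np.abs(times[:, None] - times[None, :])
    temporal /= max(temporal.max(), 1.0)

    combined = lam * spatial + (1.0 - lam) * temporal
    np.fill_diagonal(combined, 0.0)

    # Average-linkage agglomerative clustering on the combined distance matrix.
    z = linkage(squareform(combined, checks=False), method="average")
    return fcluster(z, t=n_clusters, criterion="maxclust")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy trajectory of 20 belief states over 4 hidden states.
    raw = rng.random((20, 4))
    beliefs = raw / raw.sum(axis=1, keepdims=True)
    labels = spatio_temporal_cluster(beliefs, times=np.arange(20), n_clusters=3)
    print(labels)  # one cluster label per belief state; each cluster -> a sub-POMDP
```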