Bounded-Parameter Partially Observable Markov Decision Processes

Abstract

The partially observable Markov decision process (POMDP) is considered a powerful model for planning under uncertainty. However, it is usually impractical to use a POMDP with exact parameters to model real-life situations precisely, for reasons such as limited data for learning the model. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter partially observable Markov decision processes (BPOMDPs). A modified value iteration is proposed as a basic strategy for tackling parameter imprecision in BPOMDPs. In addition, we design the UL-based value iteration algorithm, in which each value backup is based on two sets of vectors called the U-set and the L-set. We propose four typical strategies for setting the U-set and L-set, some of which guarantee that the algorithm implements the modified value iteration. We theoretically analyze the computational complexity and the reward loss of the algorithm. Empirical studies demonstrate the effectiveness and robustness of the algorithm.

Bibliographic details

Source: Proceedings of the Eighteenth International Conference on Automated Planning and Scheduling, 2008, pp. 240-247 (8 pages)
Conference location: Sydney (AU)
Authors: Yaodong Ni; Zhi-Qiang Liu
Affiliation (both authors): School of Creative Media, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
