Partially Observable Risk-Sensitive Markov Decision Processes

Baeuerle Nicole; Rieder Ulrich

Abstract

We consider the problem of minimizing a certainty equivalent of the total or discounted cost over a finite and an infinite time horizon that is generated by a partially observable Markov decision process (POMDP). In contrast to a risk-neutral decision maker, this optimization criterion takes the variability of the cost into account. It contains as a special case the classical risk-sensitive optimization criterion with an exponential utility. We show that this optimization problem can be solved by embedding the problem into a completely observable Markov decision process with extended state space and give conditions under which an optimal policy exists. The state space has to be extended by the joint conditional distribution of current unobserved state and accumulated cost. In case of an exponential utility, the problem simplifies considerably and we rediscover what in previous literature has been named information state. However, since we do not use any change of measure techniques here, our approach is simpler. A simple example, namely, a risk-sensitive Bayesian house selling problem, is considered to illustrate our results.
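
As a minimal illustration of the exponential-utility special case mentioned in the abstract, the sketch below computes the standard certainty equivalent (1/gamma) * log E[exp(gamma * C)] of a discrete cost distribution and compares it with the expected cost. It is not taken from the paper: the function name `certainty_equivalent`, the two-point cost distribution, and the values of `gamma` are assumptions chosen only for illustration, and the POMDP dynamics and the extended state space are not modeled.

```python
import math


def certainty_equivalent(costs, probs, gamma):
    """Exponential-utility certainty equivalent of a discrete cost.

    Returns (1/gamma) * log E[exp(gamma * C)]. For gamma == 0 this
    falls back to the risk-neutral expected cost.
    """
    if gamma == 0:
        return sum(p * c for c, p in zip(costs, probs))
    # Shift by the largest exponent (log-sum-exp) for numerical stability.
    m = max(gamma * c for c in costs)
    s = sum(p * math.exp(gamma * c - m) for c, p in zip(costs, probs))
    return (m + math.log(s)) / gamma


# Hypothetical two-point cost distribution (illustration only).
costs = [0.0, 10.0]
probs = [0.5, 0.5]

mean_cost = certainty_equivalent(costs, probs, 0.0)     # risk-neutral
risk_averse = certainty_equivalent(costs, probs, 0.5)   # gamma > 0
risk_seeking = certainty_equivalent(costs, probs, -0.5)  # gamma < 0

print(mean_cost, risk_averse, risk_seeking)
```

For a cost that is being minimized, a positive `gamma` penalizes variability, so the certainty equivalent lies above the mean; a negative `gamma` gives a value below the mean. This is the sense in which the criterion "takes the variability of the cost into account".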

Bibliographic details

Source
Mathematics of Operations Research, 2017, Issue 4, 17 pages
Authors
Baeuerle Nicole; Rieder Ulrich
Author affiliations

Karlsruhe Institute of Technology, Department of Mathematics, D-76128 Karlsruhe, Germany

University of Ulm, D-89069 Ulm, Germany

Original format: PDF
Language: English
Chinese Library Classification: Operations research
Keywords
partially observable Markov decision problem; certainty equivalent; exponential utility; updating operator; value iteration;
