Journal: Mathematics of Operations Research

Partially Observable Risk-Sensitive Markov Decision Processes

Abstract

We consider the problem of minimizing a certainty equivalent of the total or discounted cost over a finite and an infinite time horizon that is generated by a partially observable Markov decision process (POMDP). In contrast to a risk-neutral decision maker, this optimization criterion takes the variability of the cost into account. It contains as a special case the classical risk-sensitive optimization criterion with an exponential utility. We show that this optimization problem can be solved by embedding the problem into a completely observable Markov decision process with extended state space and give conditions under which an optimal policy exists. The state space has to be extended by the joint conditional distribution of current unobserved state and accumulated cost. In case of an exponential utility, the problem simplifies considerably and we rediscover what in previous literature has been named information state. However, since we do not use any change of measure techniques here, our approach is simpler. A simple example, namely, a risk-sensitive Bayesian house selling problem, is considered to illustrate our results.
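For the exponential-utility special case mentioned above, the following is a minimal sketch of the classical unnormalized information-state recursion from the earlier literature. It is not the construction used in the paper, which extends the state by the joint conditional distribution of hidden state and accumulated cost. The sketch assumes finite state, action, and observation spaces, a stage cost c(x, a) paid before the transition, and an observation drawn from the next hidden state. All names (`info_state_update`, `P`, `Q`, `c`, `gamma`) and the numbers are hypothetical. For a fixed action sequence, summing the final information-state mass over all observation sequences should reproduce E[exp(gamma * total cost)], which the code cross-checks against a direct enumeration of hidden-state paths.

```python
import itertools
import numpy as np

def info_state_update(sigma, a, y, P, Q, c, gamma):
    """One step of the unnormalized information-state recursion (assumed form)."""
    # Weight each current hidden state by exp(gamma * stage cost).
    weighted = sigma * np.exp(gamma * c[:, a])
    # Propagate through the transition kernel P[a][x, x'], then weight by
    # the likelihood Q[x', y] of the new observation in each next state.
    return (weighted @ P[a]) * Q[:, y]

def expected_exp_cost(sigma0, actions, P, Q, c, gamma):
    """E[exp(gamma * total cost)] via information states, summed over observations."""
    n_obs = Q.shape[1]
    total = 0.0
    for ys in itertools.product(range(n_obs), repeat=len(actions)):
        sigma = sigma0
        for a, y in zip(actions, ys):
            sigma = info_state_update(sigma, a, y, P, Q, c, gamma)
        total += sigma.sum()
    return total

def brute_force(sigma0, actions, P, c, gamma):
    """Same quantity by enumerating every hidden-state path directly."""
    n_states = len(sigma0)
    total = 0.0
    for xs in itertools.product(range(n_states), repeat=len(actions) + 1):
        prob = sigma0[xs[0]]
        cost = 0.0
        for t, a in enumerate(actions):
            prob *= P[a][xs[t], xs[t + 1]]
            cost += c[xs[t], a]
        total += prob * np.exp(gamma * cost)
    return total

# Hypothetical two-state, two-action, two-observation model.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])   # P[a][x, x']
Q = np.array([[0.8, 0.2], [0.3, 0.7]])     # Q[x', y]
c = np.array([[1.0, 0.5], [2.0, 0.5]])     # c[x, a]
sigma0 = np.array([0.6, 0.4])              # prior over the hidden state
gamma = 0.5                                # risk-sensitivity parameter
actions = [0, 1, 0]                        # fixed open-loop action sequence

value = expected_exp_cost(sigma0, actions, P, Q, c, gamma)
certainty_equivalent = np.log(value) / gamma
print(certainty_equivalent)
```

In a real control problem the actions would depend on the observation history, so the information state would serve as the state of the embedded completely observable problem rather than being summed over as it is here.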
