SIAM Journal on Control and Optimization

A CONVEX ANALYTIC APPROACH TO RISK-AWARE MARKOV DECISION PROCESSES


Abstract

In classical Markov decision process (MDP) theory, we search for a policy that, say, minimizes the expected infinite horizon discounted cost. Expectation is, of course, a risk-neutral measure, which does not suffice in many applications, particularly in finance. We replace the expectation with a general risk functional, and call such models risk-aware MDP models. We consider minimization of such risk functionals in two cases: the expected utility framework, and conditional value-at-risk, a popular coherent risk measure. Later, we consider risk-aware MDPs wherein the risk is expressed in the constraints. This includes stochastic dominance constraints and the classical chance-constrained optimization problems. In each case, we develop a convex analytic approach to solve such risk-aware MDPs. In most cases, we show that the problem can be formulated as an infinite-dimensional linear program (LP) in occupation measures when we augment the state space. We provide a discretization method and finite approximations for solving the resulting LPs. A striking result is that the chance-constrained MDP problem can be posed as an LP via the convex analytic method.
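For orientation, here is a minimal sketch of the two standard objects the abstract refers to; these are textbook formulations under the usual discounted-cost assumptions, not formulas quoted from the paper itself. First, the Rockafellar–Uryasev representation of conditional value-at-risk at level \(\alpha \in (0,1)\) for a cost \(Z\):

\[ \mathrm{CVaR}_\alpha(Z) \;=\; \min_{s \in \mathbb{R}} \Big\{\, s + \tfrac{1}{1-\alpha}\, \mathbb{E}\big[(Z - s)^+\big] \Big\}. \]

Second, the occupation-measure LP for the risk-neutral discounted MDP with state space \(X\), action space \(A\), transition kernel \(Q\), cost \(c\), discount factor \(\beta \in (0,1)\), and initial distribution \(\nu\). Defining the (normalized) occupation measure of a policy \(\pi\) by

\[ \mu_\pi(B) \;=\; (1-\beta) \sum_{t=0}^{\infty} \beta^t \, \mathbb{P}^{\pi}_{\nu}\big((x_t, a_t) \in B\big), \]

the expected discounted cost problem is equivalent (up to the factor \(1-\beta\)) to the infinite-dimensional LP

\[ \min_{\mu \ge 0} \int_{X \times A} c \, d\mu \quad \text{s.t.} \quad \mu(B \times A) \;=\; (1-\beta)\, \nu(B) \;+\; \beta \int_{X \times A} Q(B \mid x, a)\, \mu(dx, da) \quad \text{for all measurable } B \subseteq X. \]

The paper's convex analytic approach works by augmenting the state space so that risk-aware objectives (expected utility, CVaR) and risk constraints (stochastic dominance, chance constraints) can again be expressed as LPs over occupation measures of this type.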
