Journal: IEEE Transactions on Automatic Control

Affine Monotonic and Risk-Sensitive Models in Dynamic Programming



Abstract

In this paper, we consider a broad class of infinite horizon discrete-time optimal control models that involve a nonnegative cost function and an affine mapping in their dynamic programming equation. They include as special cases several classical models, such as stochastic undiscounted nonnegative cost problems, stochastic multiplicative cost problems, and risk-sensitive problems with exponential cost. We focus on the case where the state space is finite and the control space has some compactness properties, and we emphasize shortest path-type models. We assume that the affine mapping has a semicontractive character, whereby for some policies it is a contraction, whereas for others it is not. In one line of analysis, we impose assumptions guaranteeing that the noncontractive policies cannot be optimal. Under these assumptions, we prove strong results that resemble those for discounted Markovian decision problems, such as the uniqueness of solution of Bellman's equation, and the validity of forms of value and policy iteration. In the absence of these assumptions, the results are weaker and unusual in character: the optimal cost function need not be a solution of Bellman's equation, and may not be found by value or policy iteration. Instead the optimal cost function over just the contractive policies is the largest solution of Bellman's equation, and can be computed by a variety of algorithms.
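
As a rough illustration of the kind of model the abstract describes, the sketch below runs value iteration on a Bellman equation with an affine mapping. It assumes the form (T J)(i) = min over u of [ b(i,u) + sum_j A_ij(u) J(j) ] with nonnegative b and A; this notation, the two-state data in `MODEL`, and the function names are all hypothetical choices made for illustration, not taken from the paper. In this toy instance both stationary policies are contractive (each policy matrix is triangular with diagonal entries at most 0.5), so it falls in the regime where the abstract says value iteration is valid. Note that one row of A sums to more than 1, which a discounted Markovian decision problem would not allow.

```python
# Hypothetical two-state affine monotonic model (illustrative data only).
# MODEL[i][u] = (b, row) encodes the affine map for state i under control u:
#     b + sum_j row[j] * J[j],  with b >= 0 and row[j] >= 0.
MODEL = {
    0: {"a": (1.0, [0.0, 1.2]),   # row sum exceeds 1: not a discounted MDP
        "b": (2.0, [0.5, 0.0])},
    1: {"a": (1.0, [0.0, 0.5])},
}


def bellman(J):
    """Apply the Bellman operator T once; return (T J, greedy policy)."""
    new_J, policy = [], {}
    for i in sorted(MODEL):
        best_u, best_val = None, float("inf")
        for u, (b, row) in MODEL[i].items():
            val = b + sum(a * Jj for a, Jj in zip(row, J))
            if val < best_val:
                best_u, best_val = u, val
        new_J.append(best_val)
        policy[i] = best_u
    return new_J, policy


def value_iteration(J0, tol=1e-12, max_iter=10_000):
    """Iterate J <- T J until successive iterates differ by less than tol."""
    J = list(J0)
    for _ in range(max_iter):
        new_J, policy = bellman(J)
        if max(abs(x - y) for x, y in zip(new_J, J)) < tol:
            return new_J, policy
        J = new_J
    return J, policy


J_star, mu_star = value_iteration([0.0, 0.0])
print(J_star, mu_star)  # approximately [3.4, 2.0], with control "a" at state 0
```

Solving the fixed-point equations by hand gives J(1) = 1 + 0.5 J(1) = 2 and J(0) = min(1 + 1.2 * 2, 4) = 3.4, which the iteration approaches. The weaker case in the abstract, where noncontractive policies matter, is not covered by this sketch.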
