Affine Monotonic and Risk-Sensitive Models in Dynamic Programming

Bertsekas Dimitri P.


Abstract

In this paper, we consider a broad class of infinite horizon discrete-time optimal control models that involve a nonnegative cost function and an affine mapping in their dynamic programming equation. They include as special cases several classical models, such as stochastic undiscounted nonnegative cost problems, stochastic multiplicative cost problems, and risk-sensitive problems with exponential cost. We focus on the case where the state space is finite and the control space has some compactness properties, and we emphasize shortest path-type models. We assume that the affine mapping has a semicontractive character, whereby for some policies it is a contraction, whereas for others it is not. In one line of analysis, we impose assumptions guaranteeing that the noncontractive policies cannot be optimal. Under these assumptions, we prove strong results that resemble those for discounted Markovian decision problems, such as the uniqueness of solution of Bellman's equation, and the validity of forms of value and policy iteration. In the absence of these assumptions, the results are weaker and unusual in character: the optimal cost function need not be a solution of Bellman's equation, and may not be found by value or policy iteration. Instead the optimal cost function over just the contractive policies is the largest solution of Bellman's equation, and can be computed by a variety of algorithms.
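As a rough illustration of the affine structure the abstract describes, the sketch below assumes a Bellman mapping of the form (TJ)(i) = min over u of [ b_i(u) + sum_j A_ij(u) J(j) ] with nonnegative A and b, and runs plain value iteration on a hypothetical two-state model. The numbers, the `Control` type, and the function names are illustrative assumptions, not taken from the paper. Every policy in this toy model is contractive, so it corresponds to the favorable case in which Bellman's equation has a unique solution.

```python
from typing import List, Sequence, Tuple

# One control at a state: (row of A over next states, entry of b).
# All entries are assumed nonnegative, matching the affine monotonic setting.
Control = Tuple[Sequence[float], float]


def bellman_operator(J: Sequence[float],
                     controls: Sequence[Sequence[Control]]) -> List[float]:
    """Apply (TJ)(i) = min over u of [ b_i(u) + sum_j A_ij(u) * J(j) ]."""
    return [
        min(b + sum(a * Jj for a, Jj in zip(row, J)) for row, b in state_controls)
        for state_controls in controls
    ]


def value_iteration(controls: Sequence[Sequence[Control]],
                    J0: Sequence[float],
                    tol: float = 1e-10,
                    max_iter: int = 10_000) -> List[float]:
    """Iterate J <- TJ until successive iterates differ by less than tol."""
    J = list(J0)
    for _ in range(max_iter):
        J_next = bellman_operator(J, controls)
        if max(abs(x - y) for x, y in zip(J_next, J)) < tol:
            return J_next
        J = J_next
    return J


# Hypothetical model: state 0 has two controls, state 1 has one.
controls = [
    [([0.0, 0.5], 0.5), ([0.9, 0.0], 0.3)],  # controls available at state 0
    [([0.0, 0.0], 1.0)],                     # single control at state 1
]
J_star = value_iteration(controls, [0.0, 0.0])
print(J_star)
```

In this toy model the iterates settle at J = (1, 1), attained by the first control at state 0; using only the second control at state 0 would give J(0) = 3. The semicontractive cases discussed in the abstract, where some policies are not contractions, are not covered by this sketch.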


Bibliographic details

Source
IEEE Transactions on Automatic Control, 2019, no. 8, pp. 3117-3128 (12 pages)
Author
Bertsekas Dimitri P.
Author affiliation
MIT Laboratory for Information and Decision Systems, Cambridge, MA 02139, USA
Original format: PDF
Language: English
Keywords
Dynamic programming (DP); Markov decision processes; risk sensitive control; stochastic shortest paths;


Similar literature


1. Probabilistically distorted risk-sensitive infinite-horizon dynamic programming [J]. Lin Kun, Jie Cheng, Marcus Steven I. Automatica, 2018.
2. A note on negative dynamic programming for risk-sensitive control [J]. Jaskiewicz A. Operations Research Letters: A Journal of the Operations Research Society of America, 2008, no. 5.
3. Nearly optimal policies in risk-sensitive positive dynamic programming on discrete spaces [J]. Rolando Cavazos-Cadena, Raul Montes-de-Oca. Mathematical Methods of Operations Research, 2000, no. 1.
4. Search space reduction in dynamic programming using monotonic heuristics in the context of model predictive optimization [C]. Chevrant-Breton O., Tianyi Guan, Frey C.W. IEEE International Conference on Intelligent Transportation Systems, 2014.
5. A binary dynamic programming problem with affine transition and reward functions: Properties and algorithm [D]. Gatica, Ricardo Antonio. 2003.
6. LK-DFBA: a linear programming-based modeling strategy for capturing dynamics and metabolite-dependent regulation in metabolism [O]. Robert A. Dromms, Justin Y. Lee, Mark P. Styczynski. 2020.
7. Affine Monotonic and Risk-Sensitive Models in Dynamic Programming [O]. Dimitri P. Bertsekas. 2019.
8. Robust Coordination of Autonomous Systems through Risk-sensitive, Model-based Programming and Execution [R]. Williams, B., Santana, P., Fang, C. 2015.
