Conditions for the uniqueness of optimal policies of discounted Markov decision processes

Cruz-Suarez D; Montes-de-Oca R; Salem-Silva F

Abstract

This paper presents three conditions, each of which guarantees the uniqueness of optimal policies of discounted Markov decision processes. The conditions impose hypotheses on the state space X, the action space A, the admissible action sets A(x), x ∈ X, the transition probability Q, and the cost function c. Two of the conditions rely mainly on convexity assumptions. The third does not need assumptions of this kind; instead, it requires certain stochastic order relations in Q and that, with respect to the actions, the cost function c attains its minimum at exactly one action. We illustrate the conditions with several examples, including discrete models, the linear regulator problem, and a model of an inventory control system.
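
For the discrete models mentioned in the abstract, uniqueness can be checked numerically. The sketch below is not taken from the paper and does not implement its three conditions. It assumes a finite state space and finite action sets, computes the optimal discounted cost by value iteration, and then tests whether the minimizing action in the optimality equation is unique in every state. The function names, the tolerance values, and the two-state example data are all hypothetical.

```python
from typing import List

Costs = List[List[float]]         # cost[x][a] = c(x, a)
Kernel = List[List[List[float]]]  # trans[x][a][y] = Q(y | x, a)


def q_values(cost: Costs, trans: Kernel, alpha: float,
             v: List[float], x: int) -> List[float]:
    """c(x, a) + alpha * sum_y Q(y | x, a) v(y) for every action a in A(x)."""
    return [
        cost[x][a] + alpha * sum(p * v[y] for y, p in enumerate(trans[x][a]))
        for a in range(len(cost[x]))
    ]


def value_iteration(cost: Costs, trans: Kernel, alpha: float,
                    tol: float = 1e-12, max_iter: int = 100_000) -> List[float]:
    """Approximate the optimal discounted cost for a discount factor 0 < alpha < 1."""
    v = [0.0] * len(cost)
    for _ in range(max_iter):
        new_v = [min(q_values(cost, trans, alpha, v, x)) for x in range(len(cost))]
        if max(abs(n - o) for n, o in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


def minimizers(cost: Costs, trans: Kernel, alpha: float,
               gap: float = 1e-8) -> List[List[int]]:
    """For each state, list the actions attaining the minimum (up to `gap`)."""
    v = value_iteration(cost, trans, alpha)
    result = []
    for x in range(len(cost)):
        q = q_values(cost, trans, alpha, v, x)
        best = min(q)
        result.append([a for a, qa in enumerate(q) if qa - best <= gap])
    return result


def has_unique_optimal_policy(cost: Costs, trans: Kernel, alpha: float) -> bool:
    """True when every state has exactly one minimizing action."""
    return all(len(m) == 1 for m in minimizers(cost, trans, alpha))


# Hypothetical two-state, two-action model in which every action keeps the state fixed.
stay = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
print(has_unique_optimal_policy([[1.0, 2.0], [0.0, 3.0]], stay, 0.5))  # distinct costs
print(has_unique_optimal_policy([[1.0, 1.0], [0.0, 3.0]], stay, 0.5))  # tie in state 0
```

In the second model both actions in state 0 have the same cost and the same transition law, so the minimizer is not unique there. This is the kind of tie that the requirement of a single minimizing action for c is meant to exclude.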

Bibliographic details

Source: Mathematical Methods of Operations Research, 2004, Issue 3, 22 pages
Authors: Cruz-Suarez D; Montes-de-Oca R; Salem-Silva F
Original format: PDF
Language: English
Chinese Library Classification: Mathematics
Keywords: discounted Markov decision processes; uniqueness of optimal policies; convexity; stochastic order; PLANNING-HORIZONS; EXISTENCE; FORECAST
