Stochastic approximations of constrained discounted Markov decision processes

François Dufour; Tomás Prieto-Rumeau

Abstract

We consider a discrete-time constrained Markov decision process under the discounted cost optimality criterion. The state and action spaces are assumed to be Borel spaces, while the cost and constraint functions might be unbounded. We are interested in approximating numerically the optimal discounted constrained cost. To this end, we suppose that the transition kernel of the Markov decision process is absolutely continuous with respect to some probability measure μ. Then, by solving the linear programming formulation of a constrained control problem related to the empirical probability measure μ_n of μ, we obtain the corresponding approximation of the optimal constrained cost. We derive a concentration inequality which gives bounds on the probability that the estimation error is larger than some given constant. This bound is shown to decrease exponentially in n. Our theoretical results are illustrated with a numerical application based on a stochastic version of the Beverton-Holt population model.
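
The substitution of the empirical measure μ_n for μ described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the reference measure (uniform on [0, 1]), the density `density`, the scalar `c` standing in for a state-action pair, and the test function v(y) = y are all hypothetical choices made only so that the exact integral is known in closed form. The sketch approximates the integral of v(y) q(y | x, a) μ(dy) by an average over n points sampled from μ.

```python
import random


def density(y, c):
    """Hypothetical transition density q(y | x, a) with respect to mu = Uniform(0, 1).

    The state-action pair (x, a) enters only through c in [-1, 1], which keeps
    the density nonnegative and makes it integrate to 1 over [0, 1].
    """
    return 1.0 + c * (2.0 * y - 1.0)


def sample_mu(n, seed=0):
    """Draw n i.i.d. points from mu; they define mu_n = (1/n) * sum of Dirac masses."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def empirical_expectation(v, c, points):
    """Integrate v(y) * q(y | x, a) against mu_n instead of mu."""
    return sum(v(y) * density(y, c) for y in points) / len(points)


def exact_expectation(c):
    """Closed form for v(y) = y: integral of y * (1 + c * (2y - 1)) over [0, 1]."""
    return 0.5 + c / 6.0


if __name__ == "__main__":
    c = 0.5
    for n in (100, 10_000, 100_000):
        approx = empirical_expectation(lambda y: y, c, sample_mu(n))
        print(n, abs(approx - exact_expectation(c)))
```

In this toy setting the weights q(Y_i | x, a) / n need not sum exactly to one, so the empirical kernel is only approximately a probability measure. The sketch covers a single integral only; the linear program and the constraints mentioned in the abstract are not reproduced here.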

Bibliographic details

Source
Journal of Mathematical Analysis and Applications, 2014, Issue 2, 24 pages
Authors
François Dufour; Tomás Prieto-Rumeau
Original format: PDF
Language: English
Keywords
Constrained Markov decision processes; Linear programming approach to control problems; Approximation of Markov decision processes

Similar literature

1. Stochastic approximations of constrained discounted Markov decision processes [J]. François Dufour, Tomás Prieto-Rumeau. Journal of Mathematical Analysis and Applications. 2014, Issue 2
2. Finite linear programming approximations of constrained discounted Markov decision processes [J]. Dufour F., Prieto-Rumeau T. SIAM Journal on Control and Optimization. 2013, Issue 2
3. On discounted approximations of undiscounted stochastic games and Markov decision processes with limited randomness [J]. Boros E., Elbassioni K., Gurvich V., Operations Research Letters: A Journal of the Operations Research Society of America. 2013, Issue 4
4. Policy gradient stochastic approximation algorithms for adaptive control of constrained time varying Markov decision processes [C]. Abad, F.J.V., Krishnamurthy. 2003
5. Linear approximations for factored Markov decision processes [D]. Patrascu, Relu-Eugen. 2005
6. Data-Driven Markov Decision Process Approximations for Personalized Hypertension Treatment Planning [O]. Greggory J. Schell, Wesley J. Marrero, Mariel S. Lavieri. 2016
7. Finite State Approximations for Countable State Infinite Horizon Discounted Markov Decision Processes [O]. Sjur D. Flåm. 1987
8. I, II Convergence and Rate of Convergence Theorems for Constrained and Unconstrained Stochastic Approximation, via Weak Convergence Methods. III Numerical Studies for Constrained Stochastic Approximation Problems [R]. Kushner, Harold J.; Lakshmivarahan, S. 1977
