Journal: Journal of Mathematical Analysis and Applications

Stochastic approximations of constrained discounted Markov decision processes


Abstract

We consider a discrete-time constrained Markov decision process under the discounted cost optimality criterion. The state and action spaces are assumed to be Borel spaces, while the cost and constraint functions might be unbounded. We are interested in approximating numerically the optimal discounted constrained cost. To this end, we suppose that the transition kernel of the Markov decision process is absolutely continuous with respect to some probability measure μ. Then, by solving the linear programming formulation of a constrained control problem related to the empirical probability measure μ_n of μ, we obtain the corresponding approximation of the optimal constrained cost. We derive a concentration inequality which gives bounds on the probability that the estimation error is larger than some given constant. This bound is shown to decrease exponentially in n. Our theoretical results are illustrated with a numerical application based on a stochastic version of the Beverton-Holt population model.
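The empirical-measure step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it covers only the discretization of the transition kernel, not the linear program or the constraints. The assumptions are: μ is the uniform distribution on [0, 1]; the transition kernel has a hypothetical density q(y | x, a) = 1 + θ(2y − 1) with respect to μ, where θ = x − a clipped to [−1, 1]; and the function names `density`, `sample_mu`, `kernel_integral` and `exact_mean_next_state` are made up for illustration. Replacing μ by μ_n, the empirical measure of n i.i.d. draws from μ, turns an integral against the kernel into a finite average over the sample points.

```python
import random


def density(y, x, a):
    """Hypothetical transition density q(y | x, a) w.r.t. mu = Uniform(0, 1).

    theta lies in [-1, 1], so 1 + theta * (2y - 1) is non-negative on [0, 1]
    and integrates to 1 there. This toy choice is an assumption, not the
    model used in the paper.
    """
    theta = max(-1.0, min(1.0, x - a))
    return 1.0 + theta * (2.0 * y - 1.0)


def sample_mu(n, rng):
    """Draw Y_1, ..., Y_n i.i.d. from mu; they define mu_n = (1/n) * sum of point masses."""
    return [rng.random() for _ in range(n)]


def kernel_integral(v, x, a, points):
    """Approximate the integral of v(y) Q(dy | x, a) by integrating v * q against mu_n."""
    return sum(v(y) * density(y, x, a) for y in points) / len(points)


def exact_mean_next_state(x, a):
    """Closed form of the integral of y * q(y | x, a) over [0, 1]: 1/2 + theta/6."""
    theta = max(-1.0, min(1.0, x - a))
    return 0.5 + theta / 6.0


rng = random.Random(0)  # fixed seed so the run is reproducible
x, a = 0.8, 0.3
errors = {}
for n in (100, 1000, 20000):
    points = sample_mu(n, rng)
    approx = kernel_integral(lambda y: y, x, a, points)
    errors[n] = abs(approx - exact_mean_next_state(x, a))
    print(n, round(approx, 4), round(errors[n], 4))
```

In this sketch the sampled kernel does not integrate to exactly 1 against μ_n, so a finite model built on the sample points would need some normalization before it defines a transition matrix. How the paper handles that point is not stated in the abstract.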
