Journal: Mathematical Methods of Operations Research

Conditions for the uniqueness of optimal policies of discounted Markov decision processes

Abstract

This paper presents three conditions, each of which guarantees the uniqueness of optimal policies of discounted Markov decision processes. The conditions impose hypotheses on the state space X, the action space A, the admissible action sets A(x), x ∈ X, the transition probability Q, and the cost function c. Two of the conditions rely mainly on convexity assumptions; the third does not need assumptions of this kind. Instead, it requires certain stochastic order relations in Q and that, with respect to the actions, the cost function c attain its minimum at exactly one action. We illustrate the conditions with several examples, including discrete models, the linear regulator problem, and a model of an inventory control system.
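To make the notion of uniqueness concrete for the discrete case, the following is a minimal sketch and not the paper's method. It assumes a finite state space and a finite action space with every action admissible in every state, runs plain value iteration on the discounted-cost optimality equation, and then checks whether the minimizing action is a singleton in every state. The function names, the tolerance `atol`, and the two-state example data are hypothetical and chosen only for illustration.

```python
import numpy as np

def value_iteration(cost, trans, alpha, tol=1e-10, max_iter=10_000):
    """Approximate the optimal discounted cost of a finite MDP.

    cost  : array of shape (S, A), one-stage cost c(x, a)
    trans : array of shape (S, A, S), transition law Q(y | x, a)
    alpha : discount factor in (0, 1)
    Returns the value vector v and the table q[x, a] = c(x, a) + alpha * E[v].
    """
    v = np.zeros(cost.shape[0])
    for _ in range(max_iter):
        q = cost + alpha * (trans @ v)      # shape (S, A)
        v_new = q.min(axis=1)               # minimize over actions
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    q = cost + alpha * (trans @ v)
    return v, q

def minimizing_actions(q, atol=1e-8):
    """For each state, list the actions whose q-value is within atol of the minimum."""
    return [np.flatnonzero(q[x] <= q[x].min() + atol).tolist()
            for x in range(q.shape[0])]

def has_unique_optimal_policy(q, atol=1e-8):
    """True when every state has exactly one minimizing action (up to atol)."""
    return all(len(actions) == 1 for actions in minimizing_actions(q, atol))

# Hypothetical two-state, two-action model in which every action keeps the state.
trans = np.zeros((2, 2, 2))
trans[0, :, 0] = 1.0
trans[1, :, 1] = 1.0
alpha = 0.5

# The cost has a single minimizing action in each state.
cost = np.array([[1.0, 2.0], [0.0, 3.0]])
v, q = value_iteration(cost, trans, alpha)

# In state 0 both actions cost the same, so the minimizer is no longer unique.
cost_tie = np.array([[1.0, 1.0], [0.0, 3.0]])
v_tie, q_tie = value_iteration(cost_tie, trans, alpha)

print(has_unique_optimal_policy(q), has_unique_optimal_policy(q_tie))
```

In this toy model the cost reaches its minimum at a single action in each state of the first instance, which loosely mirrors the requirement stated for the third condition; the second instance breaks that property in state 0. The tolerance-based tie test is a numerical convenience and would need care in models where q-values are nearly equal.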