Frontiers of Mathematics in China

First passage Markov decision processes with constraints and varying discount factors


Abstract

This paper focuses on the constrained optimality problem (COP) for first passage discrete-time Markov decision processes (DTMDPs) with a denumerable state space and compact Borel action spaces, under multiple constraints, state-dependent discount factors, and possibly unbounded costs. Using the properties of the so-called occupation measure of a policy, we show that the constrained optimality problem is equivalent to an (infinite-dimensional) linear program over the set of occupation measures subject to some constraints, and thus prove the existence of an optimal policy under suitable conditions. Furthermore, using the equivalence between the constrained optimality problem and the linear program, we obtain an exact form of an optimal policy for the case of finite states and actions. Finally, a controlled queueing system is given as an example to illustrate our results.
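To make the occupation-measure idea concrete, here is a minimal Python sketch for a finite toy model. It is not taken from the paper. The states, transition probabilities, costs, discount factors, initial distribution, and policy are all invented for illustration, and the convention that the cost at step t is weighted by the product of the discount factors of the states visited before t is an assumption. The sketch computes the occupation measure of one fixed randomized stationary policy and checks that the expected first passage discounted cost equals the linear functional "sum of cost times occupation measure", which is the identity that allows such a problem to be written as a linear program.

```python
# Toy first passage model; every number below is a made-up assumption.
# States 0 and 1 lie outside the target set; state 2 is the target (absorbing).
TRANSIENT = [0, 1]
ACTIONS = [0, 1]
alpha = [0.9, 0.8]  # state-dependent discount factors alpha[x]
P = {               # P[x][a][y]: probability of moving from x to y under action a
    0: {0: [0.5, 0.3, 0.2], 1: [0.1, 0.4, 0.5]},
    1: {0: [0.2, 0.6, 0.2], 1: [0.3, 0.1, 0.6]},
}
cost = {0: {0: 1.0, 1: 3.0}, 1: {0: 2.0, 1: 4.0}}
gamma = [0.6, 0.4]  # initial distribution on the non-target states
phi = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.4, 1: 0.6}}  # randomized stationary policy


def state_marginal(policy, iters=2000):
    """Iterate m(y) = gamma(y) + sum_{x,a} alpha(x) policy(a|x) P(y|x,a) m(x)."""
    m = list(gamma)
    for _ in range(iters):
        m = [
            gamma[y]
            + sum(alpha[x] * policy[x][a] * P[x][a][y] * m[x]
                  for x in TRANSIENT for a in ACTIONS)
            for y in TRANSIENT
        ]
    return m


def occupation_measure(policy):
    """Discounted expected visits to (x, a) before the target set is hit."""
    m = state_marginal(policy)
    return {x: {a: m[x] * policy[x][a] for a in ACTIONS} for x in TRANSIENT}


def expected_cost(policy, iters=2000):
    """Policy evaluation: v(x) = sum_a policy(a|x) [c(x,a) + alpha(x) sum_y P(y|x,a) v(y)]."""
    v = [0.0 for _ in TRANSIENT]  # the value on the target set is 0
    for _ in range(iters):
        v = [
            sum(policy[x][a]
                * (cost[x][a] + alpha[x] * sum(P[x][a][y] * v[y] for y in TRANSIENT))
                for a in ACTIONS)
            for x in TRANSIENT
        ]
    return sum(gamma[x] * v[x] for x in TRANSIENT)


eta = occupation_measure(phi)
# Linear objective in the occupation measure.
lp_objective = sum(cost[x][a] * eta[x][a] for x in TRANSIENT for a in ACTIONS)
# The same quantity computed directly from the policy.
direct = expected_cost(phi)
# Residual of the balance equations that act as the linear constraints.
balance_residual = max(
    abs(sum(eta[y][a] for a in ACTIONS) - gamma[y]
        - sum(alpha[x] * P[x][a][y] * eta[x][a] for x in TRANSIENT for a in ACTIONS))
    for y in TRANSIENT
)
print(lp_objective, direct, balance_residual)
```

In this toy setting, minimizing the linear objective over all nonnegative `eta` that satisfy the balance equations, with one further linear inequality for each constraint cost, would give a finite-dimensional linear program of the kind the abstract refers to. The sketch only evaluates a single policy and does not solve that program.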
