Journal: Applied Mathematics and Optimization

Constrained Continuous-Time Markov Decision Processes on the Finite Horizon

Abstract

This paper studies constrained (nonhomogeneous) continuous-time Markov decision processes on a finite horizon. The performance criterion to be optimized is the expected total reward over the finite horizon, while N constraints are imposed on similar expected costs. By introducing an appropriate notion of occupation measures for this optimal control problem, we establish the following under suitable conditions: (a) the class of Markov policies is sufficient; (b) every extreme point of the space of performance vectors is generated by a deterministic Markov policy; and (c) there exists an optimal Markov policy that is a mixture of no more than N + 1 deterministic Markov policies.
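
Result (c) can be illustrated with a toy calculation for N = 1. The sketch below is not the paper's method: it assumes the performance vectors (expected reward, expected cost) of finitely many deterministic policies have already been computed, and it uses the fact that the performance of a mixture is the weighted average of the performance of its components. The function name `best_mixture`, the `budget` parameter, and all numbers are hypothetical.

```python
from itertools import combinations


def best_mixture(perf, budget):
    """Toy search for N = 1: maximize reward subject to cost <= budget.

    perf   -- list of (reward, cost) pairs, one per deterministic policy
              (hypothetical, assumed precomputed)
    budget -- upper bound on the expected cost (hypothetical constraint)
    Returns (value, {policy_index: weight}) or None if nothing is feasible.
    """
    best = None
    # Case 1: a single deterministic policy that is feasible on its own.
    for i, (r, c) in enumerate(perf):
        if c <= budget and (best is None or r > best[0]):
            best = (r, {i: 1.0})
    # Case 2: a mixture of two policies, one under and one over the budget,
    # weighted so that the cost constraint holds with equality.
    for i, j in combinations(range(len(perf)), 2):
        (ri, ci), (rj, cj) = perf[i], perf[j]
        if (ci - budget) * (cj - budget) >= 0:
            continue  # same side of the budget: mixing cannot help
        p = (budget - cj) / (ci - cj)  # weight on policy i
        value = p * ri + (1 - p) * rj
        if best is None or value > best[0]:
            best = (value, {i: p, j: 1 - p})
    return best


# Three hypothetical deterministic policies as (reward, cost) pairs.
value, weights = best_mixture([(1.0, 0.0), (3.0, 4.0), (2.0, 1.5)], budget=2.0)
print(value, weights)
```

With these made-up numbers the best feasible single policy earns 2.0, while mixing the second and third policies so that the cost constraint is tight earns more. With one constraint, at most N + 1 = 2 deterministic policies are needed, which matches the bound stated in (c).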
