Conference: World Congress on Intelligent Control and Automation
A unified approach for semi-Markov decision processes with discounted and average reward criteria

Abstract

Building on sensitivity-based optimization, we develop a unified optimization approach for semi-Markov decision processes (SMDPs) under infinite-horizon discounted and average reward criteria. We show that the sensitivity formula under the average reward criterion is a limiting case of the formula under the discounted reward criterion. Based on these performance sensitivity formulas, we provide a unified formulation of the policy iteration algorithms for SMDPs under both criteria.
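
The relationship between the two criteria can be illustrated with a minimal sketch. It is not the paper's sensitivity-based formulation, which this abstract does not spell out. It is textbook policy iteration for a discounted SMDP, under the simplifying assumption that sojourn times are exponential (a special case of an SMDP). The toy `MODEL`, the function names, and the discount rate `alpha` are all hypothetical. For a small `alpha`, `alpha * V` should be close to the long-run average reward rate, which is one informal way to see the average criterion as a limit of the discounted one.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def q_value(model, alpha, i, a, V):
    """Discounted value of taking action a in state i, then following V."""
    rate, reward, probs = model[i][a]
    # Assumption: sojourn time ~ Exp(rate), so E[exp(-alpha * T)] = rate / (alpha + rate)
    # and the expected discounted reward earned during the sojourn is reward / (alpha + rate).
    d = rate / (alpha + rate)
    return reward / (alpha + rate) + d * sum(p * v for p, v in zip(probs, V))


def evaluate(model, alpha, policy):
    """Policy evaluation: solve the linear system (I - D P) V = R."""
    n = len(model)
    A, b = [], []
    for i in range(n):
        rate, reward, probs = model[i][policy[i]]
        d = rate / (alpha + rate)
        A.append([(1.0 if i == j else 0.0) - d * probs[j] for j in range(n)])
        b.append(reward / (alpha + rate))
    return solve(A, b)


def policy_iteration(model, alpha):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0] * len(model)
    while True:
        V = evaluate(model, alpha, policy)
        changed = False
        for i in range(len(model)):
            current = q_value(model, alpha, i, policy[i], V)
            best = max(range(len(model[i])),
                       key=lambda a: q_value(model, alpha, i, a, V))
            # Switch only on a clear improvement, to avoid cycling on ties.
            if q_value(model, alpha, i, best, V) > current + 1e-9 * max(1.0, abs(current)):
                policy[i] = best
                changed = True
        if not changed:
            return policy, V


# Hypothetical two-state model: MODEL[state][action] =
# (sojourn rate, reward rate, embedded transition probabilities).
MODEL = [
    [(1.0, 1.0, [0.0, 1.0]), (2.0, 0.5, [0.0, 1.0])],
    [(1.0, 3.0, [1.0, 0.0]), (0.5, 2.8, [1.0, 0.0])],
]

if __name__ == "__main__":
    alpha = 1e-4
    policy, V = policy_iteration(MODEL, alpha)
    print(policy, [alpha * v for v in V])
```

In this toy model the policy that picks action 1 in both states has average reward rate (0.5 × 0.5 + 2.8 × 2.0) / (0.5 + 2.0) = 2.34 by the renewal-reward argument, and `alpha * V` approaches that value as `alpha` shrinks.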