International Conference on Operations Research

Algorithmic Procedures for Mean Variance Optimality in Markov Decision Chains



Abstract

In this note we discuss some algorithmic procedures for finding optimal policies of Markov decision chains with respect to various mean variance optimality criteria. To this end, we present formulas for the growth rate and asymptotic behavior of the variance of total cumulative reward. Finally, algorithmic procedures of policy iteration type for finding efficient policies with respect to various mean variance optimality criteria along with computational experience are discussed.
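To make "growth rate of the variance" concrete, here is a minimal sketch for a single fixed policy. It is not taken from the paper: it uses the standard textbook result for a finite Markov chain with one recurrent class, and it assumes rewards depend only on the current state (the paper may use transition rewards). The names `solve` and `mean_and_variance_rate` are hypothetical. The sketch returns the long-run mean reward g and the rate sigma^2 at which the variance of the total cumulative reward grows per step.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for k in range(c, n + 1):
                M[i][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def mean_and_variance_rate(P, r):
    """Return (g, sigma2) for transition matrix P and state rewards r.

    Assumptions (not from the paper): one recurrent class, rewards that
    depend only on the current state, and a fixed stationary policy.
    """
    n = len(r)
    # Stationary distribution: pi (I - P) = 0, with the last balance
    # equation replaced by the normalisation sum(pi) = 1.
    A = [[(1.0 if i == j else 0.0) - P[i][j] for i in range(n)]
         for j in range(n - 1)]
    A.append([1.0] * n)
    pi = solve(A, [0.0] * (n - 1) + [1.0])

    g = sum(pi[i] * r[i] for i in range(n))      # long-run mean reward
    rc = [r[i] - g for i in range(n)]            # centred rewards

    # Bias vector w: (I - P + 1 pi) w = rc, which also enforces pi w = 0.
    B = [[(1.0 if i == j else 0.0) - P[i][j] + pi[j] for j in range(n)]
         for i in range(n)]
    w = solve(B, rc)

    # Variance growth rate: sigma2 = sum_i pi_i (2 rc_i w_i - rc_i^2).
    sigma2 = sum(pi[i] * (2.0 * rc[i] * w[i] - rc[i] ** 2) for i in range(n))
    return g, sigma2


# Independent fair coin flips paying 0 or 1: mean 1/2, variance 1/4 per step.
g, sigma2 = mean_and_variance_rate([[0.5, 0.5], [0.5, 0.5]], [0.0, 1.0])
```

A mean-variance criterion would then compare policies on both numbers, for example by trading off g against sigma2. How that trade-off is set up inside a policy iteration scheme is the subject of the paper and is not reproduced here.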
