Journal: Mathematical Methods of Operations Research

Nonstationary denumerable state Markov Decision Processes - with average variance criterion


Abstract

In this paper, we consider nonstationary Markov decision processes (MDPs, for short) with the average variance criterion on a countable state space, with finite action spaces and bounded one-step rewards. Using the optimality equations provided in this paper, we translate the average variance criterion into a new average expected cost criterion. We then prove that there exists a Markov policy that is optimal for the original average expected reward criterion and that minimizes the average variance within the class of policies optimal for that criterion.
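The idea of recasting the average variance as an average expected cost can be illustrated on a toy problem. The sketch below is not the paper's construction: it assumes a stationary, finite-state, ergodic chain (the paper treats the nonstationary, countable-state case), and it takes one common definition of average variance, the long-run average of the squared deviation of the one-step reward from the average reward. All names (`stationary_distribution`, `average_reward`, `average_variance`, the two policies) are hypothetical.

```python
def stationary_distribution(P, iters=1000):
    """Approximate the stationary distribution of an ergodic chain by power iteration."""
    n = len(P)
    mu = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return mu


def average_reward(P, r):
    """Long-run average of r under the chain P (assumes ergodicity)."""
    mu = stationary_distribution(P)
    return sum(m * x for m, x in zip(mu, r))


def average_variance(P, r):
    """Average variance, computed as the average of a surrogate one-step cost."""
    g = average_reward(P, r)
    cost = [(x - g) ** 2 for x in r]  # surrogate cost: squared deviation from g
    return average_reward(P, cost)


# Assumed toy data: both policies induce the same chain but different rewards.
P = [[0.5, 0.5], [0.5, 0.5]]
policies = {"steady": [1.0, 1.0], "risky": [0.0, 2.0]}

# Both policies attain the same average reward, so both are average-optimal here;
# the variance criterion then selects among them.
best = min(policies, key=lambda name: average_variance(P, policies[name]))
```

In this toy case both policies have the same average reward, and minimizing the average of the surrogate cost picks out the one with the smaller reward fluctuation, which mirrors the selection within the class of average-optimal policies described in the abstract.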
