Journal: Journal of Mathematical Analysis and Applications

Bias optimality and strong n (n = -1, 0) discount optimality for Markov decision processes

Abstract

In this paper we study both bias optimality and strong n (n = -1, 0) discount optimality for denumerable discrete-time Markov decision processes. The rewards may have neither upper nor lower bounds. We give sufficient conditions on the system's primitive data, under which we establish: (1) the existence of a solution to the bias optimality equation and of bias optimal policies; (2) a condition equivalent to bias optimality of a policy; (3) the equivalence of average expected reward optimality and strong -1-discount optimality; (4) the equivalence of bias optimality and strong 0-discount optimality; (5) the existence of strong n (n = -1, 0) discount optimal stationary policies. Our conditions are weaker than those in the previous literature. Moreover, our results are illustrated by a controlled random walk. (c) 2007 Elsevier Inc. All rights reserved.
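
To make the equivalences in items (3) and (4) concrete, here is a minimal numerical sketch. It is not taken from the paper: it uses a hypothetical two-state chain under a single fixed stationary policy, with bounded rewards, whereas the paper treats denumerable state spaces and unbounded rewards. It assumes the usual reading of strong n-discount optimality, in which the discounted value V_alpha is scaled by (1 - alpha)^(-n) as alpha increases to 1. With n = -1 the scaled value tends to the average reward (gain) g, and with n = 0 the difference V_alpha - g/(1 - alpha) tends to the bias h. The names `P`, `r`, `discounted_value`, and `gain_and_bias` are illustrative only.

```python
# Hypothetical 2-state chain under one fixed stationary policy (not from the paper).
P = [[0.5, 0.5],
     [0.2, 0.8]]   # transition probabilities, each row sums to 1
r = [1.0, 0.0]     # one-step rewards


def discounted_value(alpha):
    """Solve (I - alpha * P) V = r for the 2-state chain by Cramer's rule."""
    a, b = 1.0 - alpha * P[0][0], -alpha * P[0][1]
    c, d = -alpha * P[1][0], 1.0 - alpha * P[1][1]
    det = a * d - b * c
    return [(r[0] * d - b * r[1]) / det, (a * r[1] - c * r[0]) / det]


def gain_and_bias():
    """Closed-form gain g and bias h for an ergodic 2-state chain.

    h solves g + h = r + P h with the normalisation pi . h = 0,
    where pi is the stationary distribution.
    """
    p01, p10 = P[0][1], P[1][0]
    pi = [p10 / (p01 + p10), p01 / (p01 + p10)]
    g = pi[0] * r[0] + pi[1] * r[1]
    gap = (r[0] - r[1]) / (p01 + p10)   # equals h[0] - h[1]
    h = [pi[1] * gap, -pi[0] * gap]
    return g, h


if __name__ == "__main__":
    g, h = gain_and_bias()
    print("gain g =", g, " bias h =", h)
    for alpha in (0.9, 0.99, 0.999, 0.9999):
        V = discounted_value(alpha)
        scaled = [(1.0 - alpha) * v for v in V]      # n = -1 scaling, tends to g
        shifted = [v - g / (1.0 - alpha) for v in V]  # n = 0 scaling, tends to h
        print(alpha, scaled, shifted)
```

For this chain the gain is 2/7 in both states and the bias is (50/49, -20/49). The sketch only evaluates one policy; comparing policies, which is what the optimality notions in the abstract are about, would repeat the same limits for each candidate policy.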
