The Shift-Function Approach for Markov Decision Processes with Unbounded Returns

Abstract

We study a discrete-time Markov decision process with general state and action spaces. The objective is to maximize the expected total return over a finite or infinite horizon. The transition probability measure is allowed to be defective, so the model covers discounting, state- and action-dependent transition times (semi-Markov decision processes), and stopping problems. Motivated by applications to the control of queues and inventory systems, we develop a set of conditions on the one-period return function, the transition probabilities, and the terminal value function that guarantee uniform convergence (with respect to the sup norm) of the finite-horizon optimal value functions to the infinite-horizon optimal value function (successive approximations). These conditions are substantially weaker, and more realistic for the applications we have in mind, than those of the classical discounted bounded model. (Author)
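
As a rough illustration of successive approximations with a defective transition measure, the following minimal Python sketch iterates the finite-horizon optimal value functions on a toy model. It is not the report's method: it assumes a finite state and action set with bounded returns, whereas the report treats general spaces and unbounded returns. The names `bellman_update`, `successive_approximations`, `rewards`, `kernel`, and `terminal`, and all numbers, are hypothetical. Rows of `kernel` sum to less than one, so the missing mass plays the role of discounting (mass 0.9) or stopping (mass 0).

```python
def bellman_update(v, rewards, kernel):
    """One step: (Tv)(s) = max_a [ r(s, a) + sum_t q(t | s, a) * v(t) ]."""
    n_states = len(v)
    return [
        max(
            rewards[s][a] + sum(kernel[s][a][t] * v[t] for t in range(n_states))
            for a in range(len(rewards[s]))
        )
        for s in range(n_states)
    ]


def successive_approximations(rewards, kernel, terminal, tol=1e-10, max_iter=10_000):
    """Iterate v_n = T v_{n-1} from the terminal value until the sup-norm gap < tol."""
    v = list(terminal)
    for n in range(1, max_iter + 1):
        new_v = bellman_update(v, rewards, kernel)
        gap = max(abs(x - y) for x, y in zip(new_v, v))  # sup-norm distance
        v = new_v
        if gap < tol:
            return v, n
    return v, max_iter


# Toy data (assumed for illustration): rewards[s][a] and kernel[s][a][t].
rewards = [[1.0, 0.0], [2.0, 5.0]]
kernel = [
    [[0.45, 0.45], [0.0, 0.0]],  # state 0: action 0 continues (mass 0.9), action 1 stops
    [[0.0, 0.9], [0.0, 0.0]],    # state 1: action 0 continues (mass 0.9), action 1 stops
]
terminal = [0.0, 0.0]  # terminal value function v_0

values, n_iter = successive_approximations(rewards, kernel, terminal)
print(values, n_iter)
```

In this toy model every continuing action loses at least 10% of its probability mass per step, so the update is a contraction in the sup norm and the iteration converges geometrically. The report's conditions are meant to cover cases where such a simple bounded argument is not available.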
