The Shift-Function Approach for Markov Decision Processes with Unbounded Returns

Abstract

We study a discrete-time Markov decision process with general state and action spaces. The objective is to maximize the expected total return over a finite or infinite horizon. The transition probability measure is allowed to be defective, so that the model includes discounting, state- and action-dependent transition times (semi-Markov decision processes), and stopping problems. Motivated by applications to the control of queues and inventory systems, we develop a set of conditions on the one-period return function, the transition probabilities, and the terminal value function that guarantee uniform convergence (with respect to the sup norm) of the finite-horizon optimal value functions to the infinite-horizon optimal value function (successive approximations). These conditions are substantially weaker, and more realistic for the applications we have in mind, than those of the classical discounted, bounded model. (Author)
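The successive-approximations scheme named in the abstract can be illustrated with a minimal sketch. It assumes a finite state and action space with bounded returns, so it does not reproduce the report's shift-function conditions for unbounded returns. The function name `successive_approximations` and the two-state example data are hypothetical and chosen only for illustration. The kernel rows are allowed to sum to less than 1, which is one way to read the "defective" transition measure: the missing mass plays the role of discounting or stopping.

```python
def successive_approximations(returns, kernel, terminal, horizon):
    """Return [v_0, v_1, ..., v_horizon] for a finite MDP (illustrative only).

    returns[s][a]   : one-period return for action a in state s
    kernel[s][a][t] : transition mass from s to t under a; a row may sum
                      to less than 1 (defective kernel)
    terminal[s]     : terminal value function v_0
    """
    v = list(terminal)
    history = [v]
    for _ in range(horizon):
        # One dynamic-programming step: maximize return plus expected continuation.
        v = [
            max(
                returns[s][a] + sum(p * v[t] for t, p in enumerate(kernel[s][a]))
                for a in range(len(returns[s]))
            )
            for s in range(len(v))
        ]
        history.append(v)
    return history


# Hypothetical two-state example. Action 0 continues (half of the mass is
# lost at every step), action 1 stops with an immediate return.
returns = [[1.0, 1.5], [0.0, 0.25]]
kernel = [
    [[0.5, 0.0], [0.0, 0.0]],  # from state 0
    [[0.5, 0.0], [0.0, 0.0]],  # from state 1
]
# Infinite-horizon values for this example, obtained by solving the
# optimality equation by hand: v*(0) = 2, v*(1) = 1.
v_star = [2.0, 1.0]

history = successive_approximations(returns, kernel, [0.0, 0.0], horizon=3)
for n, v_n in enumerate(history):
    gap = max(abs(x - y) for x, y in zip(v_n, v_star))
    print(n, v_n, gap)  # the sup-norm gap for n = 1, 2, 3 is 0.75, 0.25, 0.125
```

In this toy case the sup-norm gap shrinks geometrically because every kernel row has mass at most 0.5. According to the abstract, the report's contribution is a set of weaker conditions under which the same uniform convergence still holds when returns are unbounded.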

Bibliographic record

Authors
Stidham, S.; van Nunen, J.
Year: 1981
Pages: 1-57
Total pages: 57
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords
Markov processes; Decision making; Dynamic programming; Queueing theory; Control sequences; Inventory control; Convergence; Discrete distribution; Time dependence; Theorems;

Similar literature

1. ON THE MINIMUM PAIR APPROACH FOR AVERAGE COST MARKOV DECISION PROCESSES WITH COUNTABLE DISCRETE ACTION SPACES AND STRICTLY UNBOUNDED COSTS [J]. SIAM Journal on Control and Optimization, 2020, No. 2.
2. Discounted continuous-time Markov decision processes with unbounded rates and randomized history-dependent policies: the dynamic programming approach [J]. Alexey Piunovskiy, Yi Zhang. 4OR: Quarterly Journal of the Belgian, French and Italian Operations Research Societies, 2014, No. 1.
3. DISCOUNTED CONTINUOUS-TIME MARKOV DECISION PROCESSES WITH UNBOUNDED RATES: THE CONVEX ANALYTIC APPROACH [J]. Alexey Piunovskiy, Yi Zhang. SIAM Journal on Control and Optimization, 2011, No. 5.
4. Discovery of Optimal Solution Horizons in Non-Stationary Markov Decision Processes with Unbounded Rewards [C]. Grigory Neustroev, Mathijs de Weerdt, Remco Verzijlbergh. International Conference on Automated Planning and Scheduling, 2019.
5. Spectral properties for killed symmetric Markov processes with applications to Brownian motion in unbounded domains [D]. Matsuura Kouhei, 2019.
6. Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations [O]. Finale Doshi-Velez, George Konidaris.
7. Discounted Continuous-time Markov Decision Processes with Unbounded Rates: the Dynamic Programming Approach [O]. Alexey Piunovskiy, Yi Zhang, 2011.
8. Shift-Function Approach for Markov Decision Processes with Unbounded Returns [R]. Stidham, S.; van Nunen, J., 1981.
