Approximate Receding Horizon Approach for Markov Decision Processes: Average Reward Case


Abstract

The authors consider an approximation scheme for solving Markov Decision Processes (MDPs) with countable state space, finite action space, and bounded rewards. The scheme uses an approximate solution of a fixed finite-horizon sub-MDP of a given infinite-horizon MDP to create a stationary policy, which they call 'approximate receding horizon control.' They first analyze the performance of the approximate receding horizon control for infinite-horizon average reward under an ergodicity assumption; this analysis also generalizes the result obtained by White. The authors then study two examples of the approximate receding horizon control based on lower bounds on the exact solution of the sub-MDP. The first control policy is based on a finite-horizon approximation of Howard's policy improvement of a single policy, and the second is based on a generalization of the single-policy improvement to multiple policies. They also provide a simple alternative proof of the policy improvement for countable state space. The authors discuss practical implementations of these schemes via simulation.
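The receding horizon idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it solves the finite-horizon sub-MDP exactly by backward induction on a made-up 2-state, 2-action MDP, whereas the report replaces the exact solution with an approximation (for example, a lower bound estimated by simulation). The arrays `P` and `R` and both function names are assumptions introduced here for illustration.

```python
# Hypothetical 2-state, 2-action MDP (illustrative only, not from the report).
# P[a][s][t] = probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
]
# R[s][a] = bounded one-step reward for taking action a in state s.
R = [
    [1.0, 0.0],
    [0.0, 2.0],
]


def finite_horizon_values(P, R, horizon):
    """Backward induction with V_0 = 0 and
    V_h(s) = max_a [ R(s, a) + sum_t P(t | s, a) * V_{h-1}(t) ]."""
    n_states = len(R)
    n_actions = len(R[0])
    values = [0.0] * n_states
    for _ in range(horizon):
        values = [
            max(
                R[s][a] + sum(P[a][s][t] * values[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
    return values


def receding_horizon_policy(P, R, horizon):
    """Stationary policy that, in every state, plays the first action of an
    optimal plan for the `horizon`-step sub-MDP (requires horizon >= 1)."""
    n_states = len(R)
    n_actions = len(R[0])
    tail = finite_horizon_values(P, R, horizon - 1)  # value of remaining steps
    policy = []
    for s in range(n_states):
        q = [
            R[s][a] + sum(P[a][s][t] * tail[t] for t in range(n_states))
            for a in range(n_actions)
        ]
        policy.append(max(range(n_actions), key=lambda a: q[a]))
    return policy


policy = receding_horizon_policy(P, R, horizon=5)
print(policy)
```

The same action map is applied at every time step, which is what makes the resulting policy stationary. An approximate variant in the spirit of the abstract would substitute an estimate for `tail`, such as the simulated finite-horizon return of a fixed base policy.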

Bibliographic Record

Authors: Chang, H. S.; Marcus, S. I.
Year: 2002
Pages: 1-20
Total pages: 20
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords: Optimization; Decision making; Approximation (Mathematics); Problem solving; Markov processes; Stochastic control; Computerized simulation; Monte Carlo method; Policies

Similar Literature

1. Approximate receding horizon approach for Markov decision processes: average reward case [J]. Chang H. S., Marcus S. I. Journal of Mathematical Analysis and Applications. 2003, No. 2
2. A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs [J]. Vega-Amaya Oscar, Lopez-Borbon Joaquin. Kybernetika. 2019, No. 1
3. Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes [C]. Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo. International Conference on Machine Learning. 2021
4. Multistage decisions and risk in Markov decision processes: Towards effective approximate dynamic programming architectures [D]. Pratikakis, Nikolaos E. 2009
5. Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations [O]. Finale Doshi-Velez, George Konidaris
6. Approximate receding horizon approach for Markov decision processes: average reward case [O]. Chang Hyeong Soo, Marcus Steven I. 2003
