A note on deterministic approximation of discounted Markov decision processes

Cruz-Suarez Hugo; Gordienko Evgueni; Montes-de-Oca Raul

Abstract

We study the approximation of a small-noise Markov decision process $x_t = F(x_{t-1}, a_t, \xi_t^{\epsilon})$, $t = 1, 2, \ldots$, by means of its deterministic counterpart $\tilde{x}_t = F(\tilde{x}_{t-1}, a_t, s_0)$, $t = 1, 2, \ldots$, where $s_0$ is a fixed point of the disturbance metric space $(S, r)$. The total discounted cost is used as the criterion of optimality. Supposing that $\delta(\epsilon) := E\,r(\xi_1^{\epsilon}, s_0) \to 0$ as $\epsilon \to 0$, we prove the convergence of optimal policies, estimate the rate of convergence of the optimal costs, and give an upper bound (depending on $\delta(\epsilon)$) for the stability index, which measures the excess of the cost due to a replacement of the optimal policy by its deterministic approximation.
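The setting in the abstract can be illustrated with a small numerical sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar transition function F, the fixed (not optimal) linear policy, the Lipschitz cost, the discount factor 0.9, and the disturbance, which is epsilon times a uniform draw on [-1, 1], so that the fixed point is 0 and delta(epsilon) = epsilon / 2. The sketch only compares the discounted cost of one policy under the noisy and the deterministic dynamics; it does not compute optimal policies or the stability index.

```python
import random

ALPHA = 0.9  # discount factor (assumed for illustration)


def F(x, a, s):
    """Hypothetical transition function: next state from state, action, disturbance."""
    return 0.5 * x + a + s


def policy(x):
    """Hypothetical fixed linear policy (not the optimal one)."""
    return -0.25 * x


def cost(x, a):
    """Hypothetical one-step cost, Lipschitz in the state and the action."""
    return abs(x) + abs(a)


def discounted_cost(x0, noise, horizon=200):
    """Truncated total discounted cost along one trajectory.

    `noise` is a zero-argument callable returning the disturbance for each step.
    """
    x, total = x0, 0.0
    for t in range(horizon):
        a = policy(x)
        total += ALPHA ** t * cost(x, a)
        x = F(x, a, noise())
    return total


def cost_gap(eps, x0=1.0, runs=500, seed=0):
    """Absolute gap between the Monte Carlo estimate of the noisy cost and the deterministic cost."""
    rng = random.Random(seed)
    noisy = sum(
        discounted_cost(x0, lambda: eps * rng.uniform(-1.0, 1.0))
        for _ in range(runs)
    ) / runs
    deterministic = discounted_cost(x0, lambda: 0.0)  # disturbance frozen at the fixed point 0
    return abs(noisy - deterministic)


if __name__ == "__main__":
    for eps in (0.0, 0.01, 0.1):
        print(eps, cost_gap(eps))
```

For this particular toy system the closed-loop dynamics are a contraction, so the gap is bounded by a constant multiple of epsilon along every trajectory; that is a property of the assumed example, not a restatement of the paper's bound.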

Bibliographic details

Source
Applied Mathematics Letters, 2009, No. 8, 5 pages
Authors
Cruz-Suarez Hugo; Gordienko Evgueni; Montes-de-Oca Raul
Original format: PDF
Language: English
Chinese Library Classification: Applied mathematics
Keywords
Markov decision process; Total discounted cost; Deterministic approximation; Kantorovich metric; Rate of convergence

