In this paper we consider weighted reward discounted Markov decision processes (MDPs, for short) with perturbation. We prove the existence of an optimal simple ultimately deterministic policy for the process Γ_0(β_1, …, β_k). We also prove that there exists a δ-optimal simple ultimately deterministic policy in the perturbed weighted MDP Γ_d(β_1, …, β_k) for all d ∈ [0, ε*). Finally, we prove the following result: if π is an optimal policy of Γ_d(β_1, …, β_k), then for any δ > 0 there exists an ε(δ)-neighbourhood B(d) such that whenever d_1 ∈ B(d), π is a δ-optimal policy of Γ_{d_1}(β_1, …, β_k).