Bias optimality and strong n (n =-1, 0) discount optimality for Markov decision processes

Zhu QX

Abstract

In this paper we study both bias optimality and strong n (n = -1, 0) discount optimality for denumerable discrete-time Markov decision processes. The rewards may have neither upper nor lower bounds. We give sufficient conditions on the system's primitive data, under which we prove (1) the existence of the bias optimality equation and bias optimal policies; (2) a condition equivalent to bias optimal policies; (3) average expected reward optimality and strong -1-discount optimality are equivalent; (4) bias optimality and strong 0-discount optimality are equivalent; (5) the existence of strong n (n = -1, 0) discount optimal stationary policies. Our conditions are weaker than those in the previous literature. Moreover, our results are illustrated by a controlled random walk. © 2007 Elsevier Inc. All rights reserved.
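
For orientation, the criteria named in the abstract are usually formalized as sketched below. This is the standard textbook formulation, not one taken from the paper itself: the symbols (state i, policy π, stationary policy f, one-step reward r, discount factor α, gain g, bias h) are assumptions introduced here, and the paper's own definitions may differ in detail.

```latex
% Assumed notation (not from the source): x_t, a_t are the state and action at time t,
% E_i^\pi is expectation under policy \pi from initial state i, and 0 < \alpha < 1.
\begin{align*}
V_\alpha(\pi, i) &= E_i^{\pi}\Big[\sum_{t=0}^{\infty} \alpha^{t}\, r(x_t, a_t)\Big],
  \qquad V_\alpha^{*}(i) = \sup_{\pi} V_\alpha(\pi, i)
  && \text{(discounted reward)} \\
J(\pi, i) &= \liminf_{N \to \infty} \frac{1}{N}\, E_i^{\pi}\Big[\sum_{t=0}^{N-1} r(x_t, a_t)\Big]
  && \text{(average expected reward)} \\
h_f(i) &= \sum_{t=0}^{\infty} E_i^{f}\big[\, r(x_t, f(x_t)) - g(f) \,\big]
  && \text{(bias of } f \text{ with constant gain } g(f)\text{)} \\
\lim_{\alpha \uparrow 1} (1-\alpha)^{-n} &\big[ V_\alpha(\pi^{*}, i) - V_\alpha^{*}(i) \big] = 0
  \quad \text{for all states } i
  && \text{(strong } n\text{-discount optimality of } \pi^{*}\text{)}
\end{align*}
```

Under this reading, n = -1 weights the discounted gap by (1 - α), which is the natural scaling for comparison with the average reward, while n = 0 requires the unscaled gap itself to vanish as α approaches 1. A bias optimal policy is then an average optimal policy that also maximizes the bias h among average optimal policies.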

Bibliographic details

Source
Journal of Mathematical Analysis and Applications | 2007, Issue 1 | 17 pages
Author
Zhu QX
Original format: PDF
Language: English
Chinese Library Classification: Mathematics
Keywords
discrete-time Markov decision process; average reward; bias optimality; strong 0-discount optimality; optimal stationary policy; STATIONARY POLICIES; UNBOUNDED COSTS; POTENTIALS; CRITERIA; CHAINS; SET

