First passage Markov decision processes with constraints and varying discount factors

Wu Xiao; Zou Xiaolong; Guo Xianping


Abstract

This paper focuses on the constrained optimality problem (COP) of first passage discrete-time Markov decision processes (DTMDPs) with denumerable state spaces and compact Borel action spaces, under multiple constraints, state-dependent discount factors, and possibly unbounded costs. Using the properties of the so-called occupation measure of a policy, we show that the constrained optimality problem is equivalent to an (infinite-dimensional) linear program over the set of occupation measures subject to some constraints, and thus prove the existence of an optimal policy under suitable conditions. Furthermore, using the equivalence between the constrained optimality problem and the linear program, we obtain an exact form of an optimal policy for the case of finite states and actions. Finally, a controlled queueing system is given as an example to illustrate our results.
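
The abstract states the linear programming equivalence without formulas. As a rough illustration for the finite case only, the standard occupation-measure formulation can be sketched as below. All notation is assumed here and is not taken from the paper: target set $B$, first passage time $\tau_B = \inf\{n \ge 0 : x_n \in B\}$, transition law $p$, state-dependent discount factor $\alpha$, initial distribution $\gamma$, objective cost $c_0$, constrained costs $c_1,\dots,c_N$ with bounds $d_1,\dots,d_N$. The paper's own definitions and conditions may differ.

```latex
% Sketch under assumed notation; B^c denotes the complement of the target set B.
% Occupation measure of a policy \pi (an empty product equals 1):
\[
\eta^{\pi}(i,a) = \mathbb{E}_{\gamma}^{\pi}\Big[\sum_{n=0}^{\tau_B-1}
  \Big(\prod_{k=0}^{n-1}\alpha(x_k)\Big)\mathbf{1}\{x_n=i,\ a_n=a\}\Big],
  \qquad i\in B^c,\ a\in A(i).
\]
% Linear program over nonnegative \eta (finite states and actions):
\begin{align*}
\text{minimize}\quad & \sum_{i\in B^c}\sum_{a\in A(i)} c_0(i,a)\,\eta(i,a)\\
\text{subject to}\quad
 & \sum_{i\in B^c}\sum_{a\in A(i)} c_n(i,a)\,\eta(i,a)\le d_n, \qquad n=1,\dots,N,\\
 & \sum_{a\in A(j)}\eta(j,a)=\gamma(j)+\sum_{i\in B^c}\sum_{a\in A(i)}
   \alpha(i)\,p(j\mid i,a)\,\eta(i,a), \qquad j\in B^c,\\
 & \eta(i,a)\ge 0 .
\end{align*}
% A stationary randomized policy recovered from an optimal solution \eta^*:
\[
\varphi^*(a\mid i)=\frac{\eta^*(i,a)}{\sum_{a'\in A(i)}\eta^*(i,a')}
\quad\text{whenever the denominator is positive (arbitrary otherwise).}
\]
```

The balance constraint follows from conditioning on the state and action at step $n$: the discount accumulated up to step $n+1$ is the discount up to step $n$ multiplied by $\alpha(x_n)$, and only paths that have not yet entered $B$ contribute.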


Bibliographic details

Source: Frontiers of Mathematics in China, 2015, Issue 4, pp. 1005-1023 (19 pages)
Authors: Wu Xiao; Zou Xiaolong; Guo Xianping
Affiliations:

Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China; Zhaoqing Univ, Sch Math & Stat, Zhaoqing 526061, Peoples R China

Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China

Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China

Original format: PDF
Language: English
Keywords: Discrete-time Markov decision process (DTMDP); constrained optimality; varying discount factor; unbounded cost
Date added to database: 2022-08-17 23:17:23

Similar literature

1. Finite approximation of the first passage models for discrete-time Markov decision processes with varying discount factors [J]. Wu Xiao, Zhang Junyu. Discrete Event Dynamic Systems: Theory and Applications, 2016, Issue 4.
2. First passage optimality and variance minimisation of Markov decision processes with varying discount factors [J]. Wu Xiao, Guo Xianping. Journal of Applied Probability, 2015, Issue 2.
3. First Passage Optimality for Continuous-Time Markov Decision Processes With Varying Discount Factors and History-Dependent Policies [J]. Guo X., Song X., Zhang Y. IEEE Transactions on Automatic Control, 2014, Issue 1.
4. An application to the finite approximation of the first passage models for discrete-time Markov decision processes with varying discount factors [C]. Xiao Wu, Junyu Zhang. World Congress on Intelligent Control and Automation, 2014.
5. Linear approximations for factored Markov decision processes [D]. Patrascu, Relu-Eugen. 2005.
6. Decision Making Under Uncertainty: A Neural Model Based on Partially Observable Markov Decision Processes [O]. Rajesh P. N. Rao. 2010.
7. On the First Passage $g$-Mean-Variance Optimality for Discounted Continuous-Time Markov Decision Processes [O]. Guo X, Huang X, Zhang Y. 2015.
