MEAN-VARIANCE OPTIMALITY FOR SEMI-MARKOV DECISION PROCESSES UNDER FIRST PASSAGE CRITERIA

Huang Xiangxiang; Huang Yonghui

Abstract

This paper deals with a first passage mean-variance problem for semi-Markov decision processes in Borel spaces. The goal is to minimize the variance of the total discounted reward accumulated up to the system's first entry into a target set, where the optimization is over the class of policies with a prescribed expected first passage reward. The reward rates may be unbounded, and the discount factor may depend on the state of the system and the control. We first establish suitable conditions for the existence of first passage mean-variance optimal policies and provide a policy improvement algorithm for computing an optimal policy. Two examples then illustrate the results. Finally, we show how the results reduce to the cases of discrete-time Markov decision processes and continuous-time Markov decision processes.
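
The optimization problem described in the abstract can be written compactly. The block below is only a sketch in illustrative notation that is assumed here and not quoted from the paper: $B$ denotes the target set, $\tau_B$ the first passage time into $B$, $r$ the reward rate, $\alpha$ the state- and control-dependent discount factor, $g$ the prescribed expected first passage reward, and $\Pi_g$ the class of policies that attain it.

```latex
% Illustrative notation (assumed, not taken from the paper); requires amsmath/amssymb.
% x_t, a_t : state and control at time t;  B : target set.
\[
  \tau_B := \inf\{\, t \ge 0 : x_t \in B \,\}, \qquad
  R_B := \int_0^{\tau_B} \exp\!\Big(-\int_0^t \alpha(x_s, a_s)\,ds\Big)\, r(x_t, a_t)\,dt .
\]
% Constrained class of policies and the mean-variance optimality requirement.
\[
  \Pi_g := \{\, \pi : \mathbb{E}^{\pi}_x[R_B] = g(x) \ \text{for all } x \,\}, \qquad
  \operatorname{Var}^{\pi^*}_x(R_B) = \inf_{\pi \in \Pi_g} \operatorname{Var}^{\pi}_x(R_B)
  \ \text{for all } x, \quad \pi^* \in \Pi_g .
\]
```

Under this notation the mean is fixed on $\Pi_g$, so $\operatorname{Var}^{\pi}_x(R_B) = \mathbb{E}^{\pi}_x[R_B^2] - g(x)^2$, and minimizing the variance over $\Pi_g$ amounts to minimizing the second moment of $R_B$.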

Bibliographic details

Source: Kybernetika, 2017, No. 1, pp. 59-81 (23 pages)
Authors: Huang Xiangxiang; Huang Yonghui
Affiliations:

Dongguan Univ Technol, Sch Comp Sci & Network Secur, Dongguan 523000, Peoples R China;

Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China;

Original format: PDF
Language: English
Keywords: semi-Markov decision processes; first passage time; unbounded reward rate; minimal variance; mean-variance optimal policy
