Finite optimal control for time-bounded reachability in CTMDPs and continuous-time Markov games

Markus N. Rabe; Sven Schewe

Abstract

We establish the existence of optimal scheduling strategies for time-bounded reachability in continuous-time Markov decision processes, and of co-optimal strategies for continuous-time Markov games. Furthermore, we show that optimal control does not only exist, but has a surprisingly simple structure: the optimal schedulers from our proofs are deterministic and timed positional, and the bounded time can be divided into a finite number of intervals, in which the optimal strategies are positional. That is, we demonstrate the existence of finite optimal control. Finally, we show that these pleasant properties of Markov decision processes extend to the more general class of continuous-time Markov games, and that both early and late schedulers show this behaviour.
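The claim that the time bound splits into finitely many intervals with a fixed positional choice in each can be illustrated numerically. The sketch below is not the authors' construction: it assumes late schedulers and the standard Bellman-style differential equation for time-bounded reachability, integrates it with a plain forward-Euler scheme, and records where the greedy action changes. The three-state model, the action names `direct` and `detour`, and the function `solve` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy model, not taken from the paper: states 0 and 1, goal state 2.
# RATES[state][action] maps each successor to its transition rate.
RATES = {
    0: {"direct": {2: 1.0}, "detour": {1: 3.0}},
    1: {"go": {2: 3.0}},
}
GOAL = 2


def solve(time_bound, step=1e-4):
    """Forward-Euler sketch of f_s(t), the best probability of reaching GOAL
    from state s with t time units remaining.

    Assumption: late schedulers and the Bellman-style equation
    d/dt f_s(t) = max_a sum_u R(s, a, u) * (f_u(t) - f_s(t)).
    """
    f = {0: 0.0, 1: 0.0, GOAL: 1.0}   # with no time left, only the goal counts
    chosen = {}                        # action currently greedy in each state
    switches = []                      # (remaining time, state, new action)
    for k in range(round(time_bound / step)):
        t = k * step
        new_f = dict(f)
        for s, actions in RATES.items():
            # Gain of each action: rate-weighted improvement over staying put.
            gains = {
                a: sum(rate * (f[succ] - f[s]) for succ, rate in targets.items())
                for a, targets in actions.items()
            }
            best = max(gains, key=gains.get)
            if s in chosen and chosen[s] != best:
                switches.append((t, s, best))
            chosen[s] = best
            new_f[s] = f[s] + step * gains[best]
        f = new_f
    return f, switches


f, switches = solve(1.0)
print(switches)          # switching points found by the discretisation
print(round(f[0], 3))    # approximate optimal probability from state 0
```

For this toy model, working the equation out by hand gives a single switching point in state 0 at a remaining time of ln(1.5)/2, roughly 0.203: with less time left the single slow hop `direct` is better, with more time left the two fast hops via `detour` are better. The Euler scheme only approximates that point, and it says nothing about early schedulers or games, which the paper also covers.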


Bibliographic record

Source: Acta Informatica, 2011, no. 6, pp. 291-315 (25 pages)
Authors: Markus N. Rabe; Sven Schewe
Affiliations:

Universität des Saarlandes, Saarbrücken, Germany;

University of Liverpool, Liverpool, UK;

Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language: English
