Interval Markov Decision Processes with Multiple Objectives: From Robust Strategies to Pareto Curves

Hahn Ernst Moritz; Hashemi Vahid; Hermanns Holger; Lahijanian Morteza; Turrini Andrea


Abstract

Accurate modelling of a real-world system with probabilistic behaviour is a difficult task. Sensor noise and statistical estimations, among other imprecisions, make the exact probability values impossible to obtain. In this article, we consider interval Markov decision processes (IMDPs), which generalise classical MDPs by having interval-valued transition probabilities. They provide a powerful modelling tool for probabilistic systems with an additional variation or uncertainty that prevents the knowledge of the exact transition probabilities. We investigate the problem of robust multi-objective synthesis for IMDPs and Pareto curve analysis of multi-objective queries on IMDPs. We study how to find a robust (randomised) strategy that satisfies multiple objectives involving rewards, reachability, and more general omega-regular properties against all possible resolutions of the transition probability uncertainties, and how to generate an approximate Pareto curve that gives an explicit view of the trade-offs between multiple objectives. We show that the multi-objective synthesis problem is PSPACE-hard and provide a value iteration-based decision algorithm to approximate the Pareto set of achievable points. Finally, we demonstrate the practical effectiveness of our proposed approaches by applying them to several case studies using a prototype tool.
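To make the idea of interval-valued transition probabilities concrete, the following is a minimal sketch of a robust value iteration for a single reachability objective. It is not the authors' multi-objective algorithm or their prototype tool. The function names, the dictionary encoding of the IMDP, and the example model are all assumptions made for illustration. The inner step uses the standard greedy resolution of interval uncertainty: every successor first receives its lower bound, and the remaining probability mass is pushed towards the successors with the lowest current value.

```python
from typing import Dict, Set, Tuple

Interval = Tuple[float, float]
# Hypothetical encoding: state -> action -> successor -> (lower, upper) bound.
IMDP = Dict[str, Dict[str, Dict[str, Interval]]]


def worst_case_expectation(intervals: Dict[str, Interval],
                           values: Dict[str, float]) -> float:
    """Minimise sum_s p(s) * values[s] over all distributions p
    with lower(s) <= p(s) <= upper(s) and sum_s p(s) = 1."""
    probs = {s: lo for s, (lo, _) in intervals.items()}
    budget = 1.0 - sum(probs.values())
    if budget < -1e-12:
        raise ValueError("lower bounds sum to more than 1")
    # Give the remaining mass to the lowest-valued successors first.
    for s in sorted(intervals, key=lambda t: values[t]):
        lo, hi = intervals[s]
        extra = min(hi - lo, budget)
        probs[s] += extra
        budget -= extra
    if budget > 1e-12:
        raise ValueError("upper bounds sum to less than 1")
    return sum(probs[s] * values[s] for s in intervals)


def robust_reachability(imdp: IMDP, targets: Set[str],
                        iterations: int = 100) -> Dict[str, float]:
    """Approximate the maximal probability of reaching `targets`
    that can be guaranteed against every resolution of the intervals."""
    values = {s: 1.0 if s in targets else 0.0 for s in imdp}
    for _ in range(iterations):
        updated = {}
        for state, actions in imdp.items():
            if state in targets or not actions:
                updated[state] = values[state]  # absorbing state
            else:
                updated[state] = max(
                    worst_case_expectation(succ, values)
                    for succ in actions.values()
                )
        values = updated
    return values


# Illustrative model (made up): action "a" has tighter intervals than "b".
EXAMPLE_IMDP: IMDP = {
    "s0": {
        "a": {"goal": (0.6, 0.8), "fail": (0.2, 0.4)},
        "b": {"goal": (0.3, 0.9), "fail": (0.1, 0.7)},
    },
    "goal": {},
    "fail": {},
}

if __name__ == "__main__":
    print(robust_reachability(EXAMPLE_IMDP, {"goal"}))
```

In this toy model the pessimistic resolution assigns as much mass as the intervals allow to `fail`, so the robust choice in `s0` is the action whose lower bound on reaching `goal` is highest. A multi-objective variant would additionally have to track trade-offs between several such values, which this sketch does not attempt.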

Bibliographic record

Source
ACM Transactions on Modeling and Computer Simulation | 2019, Issue 4 | pp. 27.1-27.31 | 31 pages
Authors
Hahn Ernst Moritz; Hashemi Vahid; Hermanns Holger; Lahijanian Morteza; Turrini Andrea
Author affiliations

Queens Univ Belfast Sch Elect Elect Engn & Comp Sci Belfast Antrim North Ireland|Chinese Acad Sci State Key Lab Comp Sci Inst Software Beijing Peoples R China;

Audi AG Dept Informat Technol Ingolstadt Germany;

Saarland Univ Saarland Informat Campus Saarbrucken Germany|Inst Intelligent Software Guangzhou Guangdong Peoples R China;

Univ Colorado Dept Smead Aerosp Engn & Sci Boulder CO 80309 USA;

Chinese Acad Sci State Key Lab Comp Sci Inst Software Beijing Peoples R China|Inst Intelligent Software Guangzhou Guangdong Peoples R China;

Original format: PDF
Language: English
Keywords
Interval Markov decision processes; multi-objective optimisation; robust synthesis; Pareto curves; complexity;

Similar literature

1. Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes [J]. Křetínský Jan, Křetínská Zuzana, Chatterjee Krishnendu. Logical Methods in Computer Science. 2017, Issue 2

2. Markov Decision Processes with Multiple Long-run Average Objectives [J]. Tomáš Brázdil, Václav Brožek, Krishnendu Chatterjee, Logical Methods in Computer Science. 2014, Issue 1

3. Multiple objective nonatomic Markov decision processes with total reward criteria [J]. Feinberg EA., Piunovskiy AB. Journal of Mathematical Analysis and Applications. 2000, Issue 1

4. Multi-objective Robust Strategy Synthesis for Interval Markov Decision Processes [C]. Ernst Moritz Hahn, Vahid Hashemi, Holger Hermanns, International conference on quantitative evaluation of systems. 2017

5. Concurrent Markov Decision Processes for Robust Robot Team Learning under Uncertainty [D]. Girard, Justin. 2014

6. Evolving Robust Policy Coverage Sets in Multi-Objective Markov Decision Processes Through Intrinsically Motivated Self-Play [O]. Sherif Abdelfattah, Kathryn Kasmarik, Jiankun Hu. 2018

7. Multi-objective robust strategy synthesis for Interval Markov decision processes [O]. Hahn, EM, Hashemi, V, Hermanns, H, 2017
