Logical Methods in Computer Science

Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes



Abstract

We consider Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) objectives. There exist two different views: (i) the expectation semantics, where the goal is to optimize the expected mean-payoff objective, and (ii) the satisfaction semantics, where the goal is to maximize the probability of runs such that the mean-payoff value stays above a given vector. We consider optimization with respect to both objectives at once, thus unifying the existing semantics. Precisely, the goal is to optimize the expectation while ensuring the satisfaction constraint. Our problem captures the notion of optimization with respect to strategies that are risk-averse (i.e., ensure certain probabilistic guarantees). Our main results are as follows: First, we present algorithms for the decision problems, which are always polynomial in the size of the MDP. We also show that an approximation of the Pareto curve can be computed in time polynomial in the size of the MDP and the approximation factor, but exponential in the number of dimensions. Second, we present a complete characterization of the strategy complexity (in terms of memory bounds and randomization) required to solve our problem.
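The expectation semantics above rests on the long-run average (mean-payoff) reward of an MDP. As a minimal illustration only (not the paper's algorithm), the sketch below evaluates the expected mean-payoff of a fixed memoryless strategy: under such a strategy the MDP collapses to a Markov chain, and for an ergodic chain the mean-payoff is the one-step reward averaged under the stationary distribution. The transition matrix `P` and reward vector `r` are invented for this example.

```python
def stationary_distribution(P, iters=10_000):
    """Power iteration for the stationary distribution pi with pi = pi * P.

    P is a row-stochastic transition matrix (list of rows); convergence is
    guaranteed here because the example chain is ergodic.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def mean_payoff(P, r):
    """Long-run average reward: sum over states s of pi(s) * r(s)."""
    pi = stationary_distribution(P)
    return sum(p * ri for p, ri in zip(pi, r))

# Two-state ergodic chain induced by some memoryless strategy (illustrative):
# state 0 pays reward 0, state 1 pays reward 1.
P = [[0.5, 0.5],
     [0.2, 0.8]]
r = [0.0, 1.0]
print(round(mean_payoff(P, r), 3))  # stationary distribution is (2/7, 5/7)
```

With multiple reward dimensions one obtains a mean-payoff vector per strategy, and the paper's problem is to optimize its expectation subject to a probabilistic satisfaction constraint on where that vector may fall.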

