A COMPROMISE PROGRAMMING APPROACH TO MULTIOBJECTIVE MARKOV DECISION PROCESSES

WLODZIMIERZ OGRYCZAK; PATRICE PERNY; PAUL WENG


Abstract

A Markov decision process (MDP) is a general model for solving planning problems under uncertainty. It has been extended to multiobjective MDP to address multicriteria or multiagent problems in which the value of a decision must be evaluated according to several viewpoints, sometimes conflicting. Although most of the studies concentrate on the determination of the set of Pareto-optimal policies, we focus here on a more specialized problem that concerns the direct determination of policies achieving well-balanced tradeoffs. To this end, we introduce a reference point method based on the optimization of a weighted ordered weighted average (WOWA) of individual disachievements. We show that the resulting notion of optimal policy does not satisfy the Bellman principle and depends on the initial state. To overcome these difficulties, we propose a solution method based on a linear programming (LP) reformulation of the problem. Finally, we illustrate the feasibility of the proposed method on two types of planning problems under uncertainty arising in navigation of an autonomous agent and in inventory management.
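As a rough illustration of the aggregation the abstract refers to, the sketch below computes a weighted ordered weighted average (WOWA) of a vector of disachievements. It assumes the common definition in which the OWA weights are turned into a piecewise-linear function w* through the points (k/n, w_1 + ... + w_k) and the importance weights are accumulated in order of decreasing value; the abstract does not spell out the exact form used in the paper, so the function name `wowa` and its parameters are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def wowa(values: Sequence[float],
         owa_weights: Sequence[float],
         importance: Sequence[float]) -> float:
    """Weighted OWA of `values` (hypothetical helper, assumed definition).

    owa_weights: preferential weights attached to ranks (largest value first).
    importance:  importance weights attached to criteria; assumed to sum to 1.
    """
    n = len(values)
    if not (len(owa_weights) == n and len(importance) == n):
        raise ValueError("all three sequences must have the same length")

    # Breakpoints of w*: cum_w[k] = w_1 + ... + w_k, located at t = k / n.
    cum_w = [0.0]
    for w in owa_weights:
        cum_w.append(cum_w[-1] + w)

    def w_star(t: float) -> float:
        # Piecewise-linear interpolation through the points (k/n, cum_w[k]).
        t = min(max(t, 0.0), 1.0)
        k = min(int(t * n), n - 1)
        frac = t * n - k
        return cum_w[k] + frac * (cum_w[k + 1] - cum_w[k])

    # Visit criteria from the largest value (worst disachievement) down.
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    total, acc = 0.0, 0.0
    for i in order:
        new_acc = acc + importance[i]
        total += (w_star(new_acc) - w_star(acc)) * values[i]
        acc = new_acc
    return total


if __name__ == "__main__":
    disachievements = [3.0, 1.0, 2.0]
    print(wowa(disachievements, [0.5, 0.3, 0.2], [0.5, 0.25, 0.25]))
```

Under this definition, equal importance weights reduce WOWA to a plain OWA, and equal OWA weights reduce it to a weighted mean. Decreasing OWA weights put more emphasis on the largest disachievements, which is what favors well-balanced tradeoffs when the aggregate is minimized.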


Bibliographic details

Source: International Journal of Information Technology & Decision Making, 2013, Issue 5, 33 pages
Authors: WLODZIMIERZ OGRYCZAK; PATRICE PERNY; PAUL WENG
Original format: PDF
Language: English
Chinese Library Classification: Information and communication theory
Keywords: Multiobjective optimization; Markov decision processes; Compromise programming; Reference point method; Ordered weighted average; Linear programming


Similar literature

1. A Compromise Programming Approach to Multiobjective Markov Decision Processes [J]. Wlodzimierz Ogryczak, Patrice Perny, Paul Weng. International Journal of Information Technology & Decision Making. 2013, Issue 5
2. A Convex Programming Approach for Discrete-Time Markov Decision Processes under the Expected Total Reward Criterion [J]. Dufour Francois, Genadot Alexandre. SIAM Journal on Control and Optimization. 2020, Issue 4
3. A Linear Programming Based Approach for Composite-Action Markov Decision Processes [J]. Zhang Zhicong, Li Shuai, Yan Xiaohui. RAIRO - Operations Research. 2019, Issue 5
4. On Finding Compromise Solutions in Multiobjective Markov Decision Processes [C]. Patrice Perny, Paul Weng. European Conference on Artificial Intelligence. 2010
5. Multistage Decisions and Risk in Markov Decision Processes: Towards Effective Approximate Dynamic Programming Architectures [D]. Pratikakis, Nikolaos E. 2009
6. Composition of Web Services Using Markov Decision Processes and Dynamic Programming [O]. Víctor Uc-Cetina, Francisco Moo-Mena, Rafael Hernandez-Ucan. 2015
7. A Convex Programming Approach for Discrete-Time Markov Decision Processes under the Expected Total Reward Criterion [O]. François Dufour, Alexandre Genadot. 2020
