Evolving Policies for Multi-Reward Partially Observable Markov Decision Processes (MR-POMDPs)


Abstract

Plans and decisions in many real-world scenarios are made under uncertainty and must satisfy multiple, possibly conflicting objectives. In this work, we contribute the multi-reward partially observable Markov decision process (MR-POMDP) as a general modelling framework. To solve MR-POMDPs, we present two hybrid (memetic) multi-objective evolutionary algorithms that generate non-dominated sets of policies (in the form of stochastic finite state controllers). Performance comparisons between the methods on multi-objective problems in robotics (with 2, 3 and 5 objectives), web advertising (with 3, 4 and 5 objectives) and infectious disease control (with 3 objectives) revealed that the memetic variants outperformed their original counterparts. We anticipate that the MR-POMDP, along with multi-objective evolutionary solvers, will prove useful in a variety of theoretical and real-world applications.
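The abstract refers to "non-dominated sets of policies". As a rough illustration of that term only, the sketch below filters a list of reward vectors down to its non-dominated (Pareto) subset. It is not the authors' algorithm: the function names `dominates` and `non_dominated`, the sample values, and the assumption that every objective is maximized are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b in every objective
    and strictly better in at least one (maximization assumed)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical expected-reward vectors for five policies, two objectives each.
values = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
front = non_dominated(values)
```

Here `(1.5, 3.0)` and `(0.5, 0.5)` are dropped because `(2.0, 4.0)` is better in both objectives; the remaining three vectors trade one objective against the other, so none of them dominates another.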


Bibliographic details

Source
GECCO '11; Annual conference on genetic and evolutionary computation | 2012 | pp. 713-720 | 8 pages
Authors
Harold Soh; Yiannis Demiris
Original format: PDF
Chinese Library Classification: Genetics; Computing technology, computer technology
Keywords
algorithms

