Evolving Policies for Multi-Reward Partially Observable Markov Decision Processes (MR-POMDPs)

Abstract

Plans and decisions in many real-world scenarios are made under uncertainty and to satisfy multiple, possibly conflicting, objectives. In this work, we contribute the multi-reward partially-observable Markov decision process (MR-POMDP) as a general modelling framework. To solve MR-POMDPs, we present two hybrid (memetic) multi-objective evolutionary algorithms that generate non-dominated sets of policies (in the form of stochastic finite state controllers). Performance comparisons between the methods on multi-objective problems in robotics (with 2, 3 and 5 objectives), web-advertising (with 3, 4 and 5 objectives) and infectious disease control (with 3 objectives), revealed that memetic variants outperformed their original counterparts. We anticipate that the MR-POMDP along with multi-objective evolutionary solvers will prove useful in a variety of theoretical and real-world applications.
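
As an illustration of what a "non-dominated set of policies" means, the sketch below filters a handful of reward vectors down to those that no other vector Pareto-dominates. It is a minimal example and not the authors' algorithm: the function names, the maximisation convention, and the sample return values are all assumptions made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if reward vector a Pareto-dominates b (maximisation assumed)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical expected returns of four policies on two objectives.
returns = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (2.0, 4.5)]
front = non_dominated(returns)
print(front)  # indices of the policies on the Pareto front
```

In a multi-objective evolutionary solver, a filter of this kind would typically be applied to the evaluated population so that only trade-off policies are retained; how the paper's algorithms rank and select policies is not described in the abstract.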

Bibliographic details

Source
Annual Conference on Genetic and Evolutionary Computation, 2011, 8 pages
Authors
Harold Soh; Yiannis Demiris
Original format
PDF
Chinese Library Classification
Genetics
Keywords
algorithms

