GECCO '11: Annual Conference on Genetic and Evolutionary Computation

Evolving Policies for Multi-Reward Partially Observable Markov Decision Processes (MR-POMDPs)

Abstract

Plans and decisions in many real-world scenarios are made under uncertainty and must satisfy multiple, possibly conflicting, objectives. In this work, we contribute the multi-reward partially observable Markov decision process (MR-POMDP) as a general modelling framework. To solve MR-POMDPs, we present two hybrid (memetic) multi-objective evolutionary algorithms that generate non-dominated sets of policies (in the form of stochastic finite state controllers). Performance comparisons between the methods on multi-objective problems in robotics (with 2, 3 and 5 objectives), web advertising (with 3, 4 and 5 objectives) and infectious disease control (with 3 objectives) revealed that the memetic variants outperformed their original counterparts. We anticipate that the MR-POMDP, along with multi-objective evolutionary solvers, will prove useful in a variety of theoretical and real-world applications.
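
To make the notion of a "non-dominated set of policies" concrete, here is a minimal sketch of Pareto filtering over vector-valued returns. It is not the authors' algorithm: the function names (`dominates`, `non_dominated`), the sample return vectors, and the assumption that every objective is maximized are all illustrative choices made for this example.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b`, assuming every objective is maximized.

    `a` dominates `b` when it is at least as good in all objectives
    and strictly better in at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def non_dominated(returns: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the return vectors that no other vector dominates."""
    return [
        p
        for i, p in enumerate(returns)
        if not any(dominates(q, p) for j, q in enumerate(returns) if j != i)
    ]


# Hypothetical expected returns of five policies on two objectives.
policy_returns = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (2.0, 2.0), (0.5, 5.0)]
front = non_dominated(policy_returns)
print(front)
```

In this toy data, only the first two policies survive: each of the other three is matched or beaten on both objectives by one of them. A multi-objective solver would typically return such a front rather than a single policy, leaving the trade-off to the decision maker.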