Conference: International Conference on Automated Planning and Scheduling (ICAPS 2006), 2006

Stochastic Over-subscription Planning using Hierarchies of MDPs


Abstract

In over-subscription planning (OSP), the goals cannot all be achieved jointly, and the task is to find a plan that attains the best feasible subset of goals given resource constraints. Recent classical OSP algorithms ignore the uncertainty inherent in many natural application domains where OSPs arise. While modeling stochastic OSP problems as MDPs is easy, the resulting models are too large for standard solution approaches. Fortunately, OSP problems have a natural two-tiered hierarchy, and in this paper we adapt and extend tools developed in the hierarchical reinforcement learning community to exploit this hierarchy effectively and obtain compact, factored policies. Typically, such policies are sub-optimal, but under certain assumptions that hold in our planetary exploration domain, our factored solution is, in fact, optimal. Our algorithms work by repeatedly solving a number of smaller MDPs while propagating information between them. We evaluate a number of variants of this approach on a set of stochastic instances of a planetary rover domain, showing substantial performance gains.
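The two-tiered idea can be illustrated with a toy sketch. This is not the paper's algorithm: the goal names, rewards, success probabilities, and the rule that each attempt costs one unit of a single resource are all assumptions made for illustration. `flat_value` solves the full MDP over (unachieved goals, budget), while `top_value` and `sub_value` split the problem into a high-level choice of goal and a small per-goal MDP whose exit values are supplied by the high level.

```python
from functools import lru_cache

# Hypothetical toy instance: goal -> (reward, success probability per attempt).
# Every attempt is assumed to cost exactly one unit of budget.
GOALS = {"rock": (10.0, 0.5), "soil": (6.0, 0.9), "photo": (3.0, 1.0)}


@lru_cache(maxsize=None)
def flat_value(remaining, budget):
    """Optimal expected reward of the flat MDP.

    State is (frozenset of unachieved goals, remaining budget).
    """
    if budget == 0 or not remaining:
        return 0.0
    best = 0.0  # stopping is always allowed
    for g in remaining:
        reward, p = GOALS[g]
        success = reward + flat_value(remaining - {g}, budget - 1)
        failure = flat_value(remaining, budget - 1)
        best = max(best, p * success + (1 - p) * failure)
    return best


@lru_cache(maxsize=None)
def top_value(remaining, budget):
    """High-level MDP: choose which goal's sub-MDP to enter next."""
    if budget == 0 or not remaining:
        return 0.0
    return max(sub_value(g, remaining - {g}, budget) for g in remaining)


@lru_cache(maxsize=None)
def sub_value(goal, others, budget):
    """Low-level MDP for one goal: attempt again, or abandon it for good.

    The exit values top_value(others, ...) are the information passed
    between the two levels.
    """
    give_up = top_value(others, budget)
    if budget == 0:
        return give_up
    reward, p = GOALS[goal]
    success = reward + top_value(others, budget - 1)
    failure = sub_value(goal, others, budget - 1)
    return max(give_up, p * success + (1 - p) * failure)
```

Calling `top_value(frozenset(GOALS), 4)` and `flat_value(frozenset(GOALS), 4)` compares the two formulations. Because the factored policy must abandon a goal permanently once it leaves that goal's sub-MDP, its value can never exceed the flat optimum, which mirrors the abstract's remark that factored policies are typically sub-optimal.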
