
Learning Heuristic Functions from Relaxed Plans



Abstract

We present a novel approach to learning heuristic functions for AI planning domains. Given a state, we view a relaxed plan (RP) found from that state as a relational database, which includes the current state and goal facts, the actions in the RP, and the actions' add and delete lists. We represent heuristic functions as linear combinations of generic features of the database, selecting features and weights using training data from solved problems in the target planning domain. Many recent competitive planners use RP-based heuristics, but focus exclusively on the length of the RP, ignoring other RP features. Since RP construction ignores delete lists, for many domains RP length dramatically under-estimates the distance to a goal, providing poor guidance. By using features that depend on deleted facts and other RP properties, our learned heuristics can potentially capture patterns that describe where such under-estimation occurs. Experiments in the STRIPS domains of IPC 3 and 4 show that best-first search using the learned heuristic can outperform FF (Hoffmann & Nebel 2001), which provided our training data, and frequently outperforms the top performers in IPC 4.
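The core computation the abstract describes, a heuristic value formed as a weighted linear combination of features extracted from a relaxed plan, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `RelaxedPlan` and `Action` structures and the two example features (`rp_length`, `deleted_goal_facts`) are assumptions chosen to mirror the abstract, and the fitting of weights on solved training problems (e.g., by regression against observed goal distances from FF's solutions) is omitted.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

Fact = Tuple[str, ...]  # e.g. ("on", "blockA", "blockB")

@dataclass(frozen=True)
class Action:
    name: str
    add_list: FrozenSet[Fact]
    delete_list: FrozenSet[Fact]

@dataclass
class RelaxedPlan:
    state: FrozenSet[Fact]   # facts of the current state
    goal: FrozenSet[Fact]    # goal facts
    actions: List[Action]    # actions in the relaxed plan

def rp_length(rp: RelaxedPlan) -> float:
    # The standard FF-style heuristic: number of actions in the relaxed plan.
    return float(len(rp.actions))

def deleted_goal_facts(rp: RelaxedPlan) -> float:
    # Counts goal facts that some relaxed-plan action would delete --
    # a signal that RP construction itself ignores.
    deleted = set()
    for a in rp.actions:
        deleted |= a.delete_list
    return float(len(deleted & rp.goal))

FEATURES: List[Callable[[RelaxedPlan], float]] = [rp_length, deleted_goal_facts]

def learned_heuristic(rp: RelaxedPlan, weights: List[float]) -> float:
    # h(s) = sum_i w_i * f_i(RP(s)), with weights learned from
    # solved problems in the target planning domain.
    return sum(w * f(rp) for w, f in zip(weights, FEATURES))
```

With a single feature (`rp_length`) and unit weight, this reduces to the usual RP-length heuristic; the learned combination lets additional features correct for the systematic under-estimation caused by ignoring delete lists.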
