【24h】

Hyperpaths

机译:超路径

获取原文

摘要

Learning in multi-relational domains has gained in popularity over the past few years, contributing to applications in diverse areas. Typically, learning from multi-relational domains has involved learning rules about distinct entities so that they can be classified into one category or another. However, there are also interesting applications that are concerned with the problem of learning whether a number of entities are connected. Examples of these include determining whether two proteins interact in a cell, whether two identifiers are aliases, or whether a web page will refer another one; these are known as link mining [3].Inductive logic programming (ILP) systems, which often rely on hill-climbing heuristics in learning first-order concepts, have been a dominating force in the area of multi relational concept learning. However, hill-climbing heuristics are susceptible to local maxima and plateaus, which is especially a factor for large datasets where the branching factor per node can be verylarge [2, 1]. Ideally, saturation based search and a good scoring method should eventually lead us to the interesting clauses, however, the search space can grow so quickly that we risk never reaching an interesting path in a reasonable amount of time. This prompted us to consider alternative ways, such as pathfinding [4], to constrain the search space.Richards and Mooney realized that the problem of learning first-order concepts could be represented using graphs, and using the intuition that if two nodes interact there must exist an explanation, proposed that the explanation should be a connected path linking the two nodes. We agree with the idea and propose to use pathfinding on the saturated clause instead. The original pathfinding algorithm assumes the background knowledge forms an undirected graph. In contrast, the saturated clause is obtained by using mode declarations: in a nutshell, a literal can only be added to a clause if the literal's input variables are known to be bound. Mode declarations thus embed directionality in the graph formed by literals.We show how we can exploit the links between objects in multi-relational data to help a first-order rule learning system to direct the search by explicitly traversing these links to find paths between variables of interest. Specifically, we extend the pathfinding algorithm by Richards and Mooney [4] to make use of mode declarations to find paths in the saturated bottom clause, which anchor one end of the search space based on background knowledge.Our major insight is that a saturated clause for a moded program can be described as a directed hypergraph, which consists of nodes and hyperarcs that connect a nonempty set of nodes to one target node. Given this, we show that path finding can be reduced to reachability in the hypergraph, whereby each hyperpath will correspond to a hypothesis. However, we may be interested in non-minimal paths and in the composition of paths. We thus propose and evaluate an algorithm that can enumerate all such hyperpaths according to some heuristic and test it on the UW-CSE dataset by Richardson and Domingos [5]. Experimental results on a medium sized dataset show that path finding allows one to consider interesting clauses that would not easily be found by Aleph.
机译:在多关系领域学习技术日益普及,在过去几年,有利于在不同领域的应用。通常情况下,从多关系领域的学习大约有不同的实体参与学习规则,使他们可以归为一类或其他。然而,也有关心学习一些实体是否连接的问题,有趣的应用。这些的实例包括相互作用测定细胞是否两种蛋白质,两个标识符是否是别名,或网页是否将指另一个;这些公知为链接采​​矿[3] .Inductive逻辑编程(ILP)的系统,其通常依赖于在学习一阶概念爬坡启发式,已在多关系概念学习的区域中的主导力。然而,爬山启发式易受局部最大值和高原,这是尤其对于大的数据集,其中每个节点的分支因子可以是verylarge [2,1]的一个因素。理想的情况下,饱和基于搜索和良好的记分方法终将导致我们有趣的条款,但是,搜索空间可以如此迅速地增长,我们从未风险达到在合理时间内一个有趣的路径。这促使我们考虑替换方式,如寻路 [4],将搜索范围限制space.Richards和穆尼意识到学习一阶概念的问题可以用图来表示,并且使用所述直觉,如果交互必须存在一个解释两个节点,提出的解释应该是连接两个节点连接的路径。我们同意的想法,并建议使用寻路的饱和条款代替。最初的寻路算法假定的背景知识形成一个无向图。与此相反,饱和子句是通过使用获得的模式声明:概括地说,一个文字只能被添加到一个条款如果字面的输入变量是已知的约束。模式声明从而嵌入方向性由literals.We展示我们如何能够利用多关系数据对象之间的联系,以帮助一阶规则学习系统通过明确遍历这些链接查找变量之间的路径来直接搜索形成的图出于兴趣。具体而言,我们扩展了寻路算法由Richards和门尼[4]利用模式声明找到饱和底部子句中的路径,它锚定所述搜索的一端基于背景knowledge.Our主要见解空间是,对于一个moded程序饱和子句可以被描述为定向超图,它由一个非空组节点连接到一个目标节点的节点和hyperarcs的。鉴于此,我们证明了寻路可以在超图,其中每个超路径将对应于一个假设降至可达性。然而,我们可能感兴趣的非最小路径和在路径的组合物。因此,我们提出并评价了一种算法,可以枚举根据一些启发式所有这样hyperpaths和测试它由Richardson和多明戈斯[5] UW-CSE数据集。在介质上的实验结果数据集的大小显示,寻路让一个考虑,不会轻易被阿莱夫找到有趣的条款。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号