【24h】

A New Perspective on the Tree Edit Distance

机译:树上的新透视图编辑距离

获取原文

摘要

The tree edit distance (TED), defined as the minimum-cost sequence of node operations that transform one tree into another, is a well-known distance measure for hierarchical data. Thanks to its intuitive definition, TED has found a wide range of diverse applications like software engineering, natural language processing, and bioinformatics. The state-of-the-art algorithms for TED recursively decompose the input trees into smaller subproblems and use dynamic programming to build the result in a bottom-up fashion. The main line of research deals with efficient implementations of a recursive solution introduced by Zhang in the late 1980s. Another more recent recursive solution by Chen found little attention. Its relation to the other TED solutions has never been studied and it has never been empirically tested against its competitors. In this paper we fill the gap and revisit Chen's TED algorithm. We analyse the recursion by Chen and compare it to Zhang's recursion. We show that all subproblems generated by Chen can also origin from Zhang's decomposition. This is interesting since new algorithms that combine the features of both recursive solutions could be developed. Moreover, we revise the runtime complexity of Chen's algorithm and develop a new traversal strategy to reduce its memory complexity. Finally, we provide the first experimental evaluation of Chen's algorithm and identify tree shapes for which Chen's solution is a promising competitor.
机译:树编辑距离(TED),定义为将一个树转换为另一个树的节点操作的最小成本序列,是分层数据的众所周知的距离测量。由于其直观的定义,TED已发现软件工程,自然语言处理和生物信息学等广泛的多样化应用。 TED的最先进的算法递归地将输入树分解为较小的子问题,并使用动态编程以自下而上的方式构建结果。在20世纪80年代后期,主要研究的主要研究涉及Zhang引入的递归解决方案的有效实施。陈某的另一个更新的递归解决方案发现很少关注。它与其他TED解决方案的关系从未被研究过,并且从未对其竞争对手进行了经验测试。在本文中,我们填补了差距和重新审视陈的TED算法。我们分析陈的递归,并将其与张的递归进行比较。我们表明,陈某生成的所有子问题也可以从张的分解起源。这很有趣,因为可以开发结合递归解决方案的功能的新算法。此外,我们修改了陈算法的运行时复杂性,并开发了一种新的遍历策略,以降低其内存复杂性。最后,我们提供了陈氏算法的第一个实验评估,并识别陈氏解决方案是一个有前途的竞争对手的树形。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号