A New Perspective on the Tree Edit Distance

机译：树上的新透视图编辑距离

获取原文

页面导航

摘要
著录项
相似文献
相关主题

摘要

The tree edit distance (TED), defined as the minimum-cost sequence of node operations that transform one tree into another, is a well-known distance measure for hierarchical data. Thanks to its intuitive definition, TED has found a wide range of diverse applications like software engineering, natural language processing, and bioinformatics. The state-of-the-art algorithms for TED recursively decompose the input trees into smaller subproblems and use dynamic programming to build the result in a bottom-up fashion. The main line of research deals with efficient implementations of a recursive solution introduced by Zhang in the late 1980s. Another more recent recursive solution by Chen found little attention. Its relation to the other TED solutions has never been studied and it has never been empirically tested against its competitors. In this paper we fill the gap and revisit Chen's TED algorithm. We analyse the recursion by Chen and compare it to Zhang's recursion. We show that all subproblems generated by Chen can also origin from Zhang's decomposition. This is interesting since new algorithms that combine the features of both recursive solutions could be developed. Moreover, we revise the runtime complexity of Chen's algorithm and develop a new traversal strategy to reduce its memory complexity. Finally, we provide the first experimental evaluation of Chen's algorithm and identify tree shapes for which Chen's solution is a promising competitor.

机译：树编辑距离（TED），定义为将一个树转换为另一个树的节点操作的最小成本序列，是分层数据的众所周知的距离测量。由于其直观的定义，TED已发现软件工程，自然语言处理和生物信息学等广泛的多样化应用。 TED的最先进的算法递归地将输入树分解为较小的子问题，并使用动态编程以自下而上的方式构建结果。在20世纪80年代后期，主要研究的主要研究涉及Zhang引入的递归解决方案的有效实施。陈某的另一个更新的递归解决方案发现很少关注。它与其他TED解决方案的关系从未被研究过，并且从未对其竞争对手进行了经验测试。在本文中，我们填补了差距和重新审视陈的TED算法。我们分析陈的递归，并将其与张的递归进行比较。我们表明，陈某生成的所有子问题也可以从张的分解起源。这很有趣，因为可以开发结合递归解决方案的功能的新算法。此外，我们修改了陈算法的运行时复杂性，并开发了一种新的遍历策略，以降低其内存复杂性。最后，我们提供了陈氏算法的第一个实验评估，并识别陈氏解决方案是一个有前途的竞争对手的树形。

著录项

来源
《International Conference on Similarity Search and Applications》|2017年|332p|共15页
会议地点
作者
Stefan Schwarz; Mateusz Pawlik; Nikolaus Augsten;
展开▼
作者单位

展开▼
会议组织
原文格式 PDF
正文语种
中图分类 TP301-53;
关键词

相似文献

外文文献
中文文献
专利

1. Approximating Tree Edit Distance through String Edit Distance for Binary Tree Codes [J] . Taku Aratsu, rnKouichi Hirata, rnTetsuji Kuboyama Fundamenta Informaticae . 2010,第3期

机译：通过二叉树代码的字符串编辑距离近似树编辑距离
2. Approximating Tree Edit Distance through String Edit Distance [J] . Tatsuya Akutsu, Daiji Fukagawa, Atsuhiro Takasu Algorithmica . 2010,第2期

机译：通过字符串编辑距离近似树的编辑距离
3. Approximating Tree Edit Distance through String Edit Distance [J] . Tatsuya Akutsu, Daiji Fukagawa, Atsuhiro Takasu Algorithmica . 2010,第2期

机译：通过字符串编辑距离近似树的编辑距离
4. Approximating Tree Edit Distance through String Edit Distance for Binary Tree Codes [C] . Taku Aratsu, Kouichi Hirata, Tetsuji Kuboyama SOFSEM 2009: Theory and practice of computer science . 2009

机译：通过二叉树代码的字符串编辑距离近似树编辑距离
5. Fast Edit Distance Calculation Methods for NGS Sequence Similarity [D] . Islam, A. K. M. Tauhidul. 2020

机译：NGS序列相似性快速编辑距离计算方法
6. Tree edit distance for leaf-labelled trees on free leafset and its comparison with frequent subsplit dissimilarity and popular distance measures [O] . Jakub Koperwas, Krzysztof Walczak 2011

机译：自由叶子集上带有叶子标签的树的树编辑距离以及与频繁亚分割差异和常用距离度量的比较
7. Approximating Tree Edit Distance through String Edit Distance [O] . Akutsu Tatsuya, Fukagawa Daiji, Takasu Atsuhiro 2010

机译：通过字符串编辑距离近似树的编辑距离

A New Perspective on the Tree Edit Distance

摘要

著录项

相似文献

相关主题

期刊订阅