A fine-grained XML structural comparison approach
Abstract
As the Web continues to grow and evolve, more and more information is being placed in structurally rich documents, XML documents in particular, so as to improve the efficiency of similarity clustering, information retrieval, and data management applications. Various algorithms for comparing hierarchically structured data, e.g., XML documents, have been proposed in the literature. Most of them rely on techniques for finding the edit distance between tree structures, with XML documents modeled as ordered labeled trees. Nevertheless, a thorough investigation of current approaches led us to identify several structural similarity aspects, i.e., sub-tree related similarities, that are not sufficiently addressed when comparing XML documents. In this paper, we provide an improved comparison method that deals with fine-grained sub-trees and leaf node repetitions without increasing overall complexity with respect to current XML comparison methods. Our approach consists of two main algorithms for discovering the structural commonality between sub-trees and computing tree-based edit operation costs. A prototype has been developed to evaluate the optimality and performance of our method. Experimental results on both real and synthetic XML data demonstrate better performance with respect to alternative XML comparison methods.
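To make the baseline concrete, the following is a minimal sketch of a tree edit distance over XML documents modeled as ordered labeled trees. It is not the method proposed in the paper: it is a simple Selkow-style top-down distance with unit costs, in which a whole sub-tree is inserted or deleted at a cost equal to its node count, and it does not account for the sub-tree commonalities or leaf node repetitions that the paper targets. The function names (`to_tree`, `size`, `edit_distance`, `xml_distance`) and the cost model are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from functools import lru_cache


def to_tree(elem):
    """Convert an Element into a hashable (label, children) tuple.

    Only element tags are kept; text and attributes are ignored (an assumption).
    """
    return (elem.tag, tuple(to_tree(child) for child in elem))


def size(tree):
    """Number of nodes in the tree, used as the sub-tree insert/delete cost."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def edit_distance(a, b):
    """Selkow-style top-down edit distance between two ordered labeled trees."""
    relabel = 0 if a[0] == b[0] else 1  # cost of changing the root label
    kids_a, kids_b = a[1], b[1]
    m, n = len(kids_a), len(kids_b)

    # dist[i][j]: cost of turning the first i children of a
    # into the first j children of b.
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + size(kids_a[i - 1])  # delete sub-tree
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + size(kids_b[j - 1])  # insert sub-tree
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dist[i][j] = min(
                dist[i - 1][j] + size(kids_a[i - 1]),      # delete sub-tree
                dist[i][j - 1] + size(kids_b[j - 1]),      # insert sub-tree
                dist[i - 1][j - 1]
                + edit_distance(kids_a[i - 1], kids_b[j - 1]),  # transform
            )
    return relabel + dist[m][n]


def xml_distance(xml_a, xml_b):
    """Parse two XML strings and return their structural edit distance."""
    return edit_distance(to_tree(ET.fromstring(xml_a)),
                         to_tree(ET.fromstring(xml_b)))


if __name__ == "__main__":
    doc_a = "<paper><title/><author/><author/></paper>"
    doc_b = "<paper><title/><author/></paper>"
    print(xml_distance(doc_a, doc_b))  # one repeated leaf is deleted: 1
```

In this sketch the repeated `author` leaf is simply charged as one deletion, regardless of the fact that an identical sibling already exists in the other document. That is the kind of coarse treatment of repetitions and sub-tree similarity that the abstract identifies as insufficiently addressed by edit-distance baselines.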