Journal: Frontiers of Computer Science in China

Efficient graph similarity join for information integration on graphs



Abstract

Graphs are widely used to represent complex data in real applications such as social networks, bioinformatics, and computer vision. Graph similarity join has therefore become essential for integrating noisy and inconsistent data from multiple data sources. Edit distance is commonly used to measure the similarity between graphs, and the graph similarity join problem studied in this paper is defined under graph edit distance constraints. To accelerate the join, we first apply a preprocessing strategy that removes mismatching graph pairs with significant differences. We then propose a novel indexing method, the k-hop tree based index, which for each key node of a graph groups the nodes reachable within k hops while preserving structure. For each candidate pair, we propose a similarity computation algorithm with boundary filtering that is both efficient and effective. Experiments on real and synthetic graph databases confirm that our method achieves good join quality in graph similarity join. In addition, the join process finishes in polynomial time.
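The abstract does not give the details of the preprocessing filter or the k-hop index, so the following is only a minimal sketch of the two ideas under stated assumptions, not the paper's algorithm. It assumes unit-cost edit operations, labeled nodes, and unlabeled undirected edges. The graph representation and the names `ged_lower_bound`, `candidate_pairs`, and `k_hop_nodes` are hypothetical. The filter uses a standard label-and-size lower bound on graph edit distance: a pair whose lower bound already exceeds the threshold `tau` cannot satisfy the join condition and is discarded before any expensive verification.

```python
from collections import Counter, deque
from itertools import combinations

# Hypothetical representation (an assumption, not from the paper):
# a graph is a pair (labels, edges), where labels maps node id -> label
# and edges is a set of (u, v) tuples for undirected edges.

def ged_lower_bound(g1, g2):
    """Cheap lower bound on unit-cost graph edit distance.

    Node part: nodes that cannot be matched to a node with the same label
    need at least one insert, delete, or relabel each.
    Edge part: the difference in edge counts needs that many edge edits.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    common = sum((Counter(labels1.values()) & Counter(labels2.values())).values())
    node_cost = max(len(labels1), len(labels2)) - common
    edge_cost = abs(len(edges1) - len(edges2))
    return node_cost + edge_cost

def candidate_pairs(graphs, tau):
    """Keep only index pairs whose lower bound does not exceed tau."""
    return [(i, j)
            for i, j in combinations(range(len(graphs)), 2)
            if ged_lower_bound(graphs[i], graphs[j]) <= tau]

def k_hop_nodes(g, source, k):
    """Map each node reachable from source within k hops to its hop count (BFS)."""
    labels, edges = g
    adj = {v: set() for v in labels}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k hops
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

if __name__ == "__main__":
    g_a = ({1: "A", 2: "B", 3: "C"}, {(1, 2), (2, 3)})
    g_b = ({1: "A", 2: "B", 3: "D"}, {(1, 2), (2, 3)})
    g_c = ({1: "X", 2: "Y", 3: "Z", 4: "W", 5: "V"},
           {(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)})
    print(candidate_pairs([g_a, g_b, g_c], tau=2))
    print(k_hop_nodes(g_c, source=1, k=1))
```

Surviving pairs are only candidates: the lower bound can pass pairs whose true edit distance exceeds `tau`, so each candidate would still need a verification step such as the boundary-filtered similarity computation the abstract mentions. The hop map returned by `k_hop_nodes` is the kind of grouping a k-hop index could be built from; how the paper selects key nodes and preserves structure is not specified here.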
