首页> 外文会议>International Conference on Data Engineering >An Efficient Probabilistic Approach for Graph Similarity Search
【24h】

An Efficient Probabilistic Approach for Graph Similarity Search

机译:图形相似度搜索有效的概率方法

获取原文

摘要

Graph similarity search is a common and fundamental operation in graph databases. One of the most popular graph similarity measures is the Graph Edit Distance (GED) mainly because of its broad applicability and high interpretability. Despite its prevalence, exact GED computation is proved to be NP-hard, which could result in unsatisfactory computational efficiency on large graphs. However, exactly accurate search results are usually unnecessary for real-world applications especially when the responsiveness is far more important than the accuracy. Thus, in this paper, we propose a novel probabilistic approach to efficiently estimate GED, which is further leveraged for the graph similarity search. Specifically, we first take branches as elementary structures in graphs, and introduce a novel graph similarity measure by comparing branches between graphs, i.e., Graph Branch Distance (GBD), which can be efficiently calculated in polynomial time. Then, we formulate the relationship between GED and GBD by considering branch variations as the result ascribed to graph edit operations, and model this process by probabilistic approaches. By applying our model, the GED between any two graphs can be efficiently estimated by their GBD, and these estimations are finally utilized in the graph similarity search. Extensive experiments show that our approach has better accuracy, efficiency and scalability than other comparable methods in the graph similarity search over real and synthetic data sets.
机译:图相似性搜索是图形数据库中的常见和基本操作。最流行的图形相似度措施之一是图表编辑距离(GED)主要是因为其广泛的适用性和高的可解释性。尽管它流行,但确切的GED计算被证明是NP-Hard,这可能导致大图中的计算效率不令人满意。然而,完全准确的搜索结果通常对于真实世界的应用通常是不必要的,特别是当响应性远远超过精度时更重要。因此,在本文中,我们提出了一种新的概率方法来有效地估计GED,这进一步利用了图形相似度搜索。具体地,我们首先将分支作为曲线图中的基本结构,并通过比较图之间的分支,即图形分支距离(GBD)之间的分支来引入新的图形相似度测量,这可以在多项式时间中有效地计算。然后,我们通过考虑分支变化作为赋予图表编辑操作的结果,通过概率方法模拟此过程来制定GED和GBD之间的关系。通过应用我们的模型,可以通过其GBD有效地估计任何两个图之间的GED,并且最终在图形相似度搜索中使用这些估计。广泛的实验表明,我们的方法具有比图形相似性在实际和合成数据集的其他类似方法中的更好的准确性,效率和可伸缩性。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号