Connected Substructure Similarity Search


Abstract

Substructure similarity search retrieves graphs that approximately contain a given query graph. It has many applications, e.g., detecting similar functions among chemical compounds. The problem is challenging because even testing subgraph containment between two graphs is NP-complete. Hence, existing techniques adopt the filtering-and-verification framework, focusing on effective and efficient techniques to remove unpromising graphs. Nevertheless, existing filtering techniques may still be unable to remove many low-quality candidates. To resolve this, in this paper we propose a novel indexing technique, GrafD-Index, which indexes graphs according to their "distances" to features. We characterize a tight condition under which the distance-based triangular inequality holds. We then develop lower and upper bounding techniques that exploit the GrafD-Index to (1) prune unpromising graphs and (2) include graphs whose similarities are guaranteed to exceed the given similarity threshold. Considering that the verification phase is not well studied and plays the dominant role in the whole process, we also devise efficient algorithms to verify candidates. Comprehensive experiments on real datasets demonstrate that our proposed methods significantly outperform existing methods.
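The bounding idea in the abstract can be illustrated with a minimal sketch. It is not the paper's GrafD-Index: it assumes a generic distance that satisfies the triangle inequality unconditionally, whereas the paper states that the inequality holds only under a specific condition. It also expresses the similarity threshold as a maximum distance `tau`. The function name `classify_candidates`, the dictionary layout of the index, and the feature identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple


def classify_candidates(
    query_dists: Dict[str, float],
    index: Dict[str, Dict[str, float]],
    tau: float,
) -> Tuple[List[str], List[str], List[str]]:
    """Split indexed graphs into (pruned, accepted, to_verify).

    query_dists: distance from the query to each feature (hypothetical layout).
    index: for each graph id, its precomputed distance to each feature.
    tau: maximum allowed distance (assumed stand-in for the similarity threshold).
    """
    pruned, accepted, to_verify = [], [], []
    for gid, graph_dists in index.items():
        # Triangle inequality: |d(q,f) - d(g,f)| <= d(q,g) <= d(q,f) + d(g,f).
        # The tightest lower bound is the max over features,
        # the tightest upper bound is the min over features.
        lower = max(abs(qd - graph_dists[f]) for f, qd in query_dists.items())
        upper = min(qd + graph_dists[f] for f, qd in query_dists.items())
        if lower > tau:
            pruned.append(gid)      # cannot be within tau of the query
        elif upper <= tau:
            accepted.append(gid)    # guaranteed to be within tau, no verification
        else:
            to_verify.append(gid)   # bounds are inconclusive, needs exact check
    return pruned, accepted, to_verify


# Toy data: distances are made up for illustration only.
query = {"f1": 1.0, "f2": 4.0}
toy_index = {
    "g1": {"f1": 6.0, "f2": 4.0},
    "g2": {"f1": 0.5, "f2": 4.5},
    "g3": {"f1": 2.0, "f2": 3.0},
}
pruned, accepted, to_verify = classify_candidates(query, toy_index, tau=2.0)
```

Only the graphs left in `to_verify` would go on to the expensive verification step, which is why tighter bounds directly reduce the overall cost.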