首页> 外文会议>IEEE International Conference on Big Data >Hybrid algorithms for subgraph pattern queries in graph databases
【24h】

Hybrid algorithms for subgraph pattern queries in graph databases

机译:图数据库中子图模式查询的混合算法

获取原文

摘要

Numerous methods have been proposed over the years for subgraph query processing, as it is central to graph analytics. Existing work is fragmented into two major categories. Methods in the filter-then-verify (FTV) category first construct an index of the DB graphs. Given a query, the index is used to filter out graphs that cannot contain the query. On the remaining graphs, a subgraph isomorphism algorithm is applied to verify whether each graph indeed contains the query. A second category of algorithms is mainly concerned with optimizing the Subgraph Isomorphism (SI) testing process (an NP-Complete problem) in order to find all occurrences of the query within each DB graph, also known as the matching problem. The current research trend is to totally dismiss FTV methods, because SI methods have been shown to enjoy much shorter query execution times and because of the alleged high costs of managing the DB graph index in FTV methods. Thus, a number of new SI methods are being proposed annually. In the current work, we initially study the performance of the latest SI algorithms over datasets consisting of a large number of graphs. With our study, we evaluate the algorithms' performance and we provide comparison details with former studies. As a second step, we combine the powerful filtering of a top-performing FTV method, with the various SI methods, which leads to the best practice conclusion that SI and FTV shouldn't be thought of as disjoint types of solutions, as their union achieves better results than any one of them individually. Specifically, we experimentally analyze and quantify the (positive) impact of including the essence of indexed FTV methods within SI methods, showing that query processing times can be significantly improved at modest additional memory costs. We show that these results hold over a variety of well-known SI methods and across several real and synthetic datasets. As such, hybrids of the type reveal a missing opportunity and a blind spot in related literature and trends.
机译:多年来,已经提出了许多用于子图查询处理的方法,因为它是图分析的核心。现有工作分为两大类。 “先过滤后验证”(FTV)类别中的方法首先构造DB图的索引。给定查询,索引用于过滤掉不能包含查询的图。在其余图上,应用子图同构算法来验证每个图是否确实包含查询。第二类算法主要与优化子图同构(SI)测试过程(NP完全问题)有关,以便找到每个DB图中所有出现的查询,也称为匹配问题。当前的研究趋势是完全取消FTV方法,因为已证明SI方法享有更短的查询执行时间,并且由于在FTV方法中管理DB图形索引的成本很高。因此,每年都提出许多新的SI方法。在当前的工作中,我们最初对包含大量图形的数据集研究最新的SI算法的性能。通过我们的研究,我们评估了算法的性能,并提供了与以前研究的比较细节。第二步,我们将性能最佳的FTV方法的强大过滤功能与各种SI方法结合在一起,从而得出最佳实践结论,即SI和FTV不应被视为解决方案的不相交类型,因为它们是联合的取得比其中任何一个单独的结果更好的结果。具体来说,我们通过实验方法分析和量化了(积极的)影响,其中包括将索引FTV方法的本质包含在SI方法中,这表明可以以适度的额外存储成本显着改善查询处理时间。我们表明,这些结果适用于多种众所周知的SI方法以及多个实际和综合数据集。因此,这种类型的混合动力汽车在相关文献和趋势中揭示了机会的缺失和盲点。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号