首页> 外文OA文献 >Efficient processing of advanced structural queries in graph databases
【2h】

Efficient processing of advanced structural queries in graph databases

机译:在图数据库中高效处理高级结构查询

代理获取
本网站仅为用户提供外文OA文献查询和代理获取服务,本网站没有原文。下单后我们将采用程序或人工为您竭诚获取高质量的原文,但由于OA文献来源多样且变更频繁,仍可能出现获取不到、文献不完整或与标题不符等情况,如果获取不到我们将提供退款服务。请知悉。

摘要

Recent decades witnessed a rapid proliferation of graph data, such as chemicalstructures and business processes. Structural queries are frequently issued in thesedomains, and hence, attract extensive attention. Due to data inconsistency andnoise, a recent trend is to study similarity queries. Moreover, graphs may possessmultiple attributes on vertices, imposing additional challenges. Hence, indexingand processing methods for answering advanced structural queries are strongly indemand.The thesis studies three types of emerging structural queries in large-scale graphdatabases, including graph similarity join, graph similarity search and graph substructureselection.Graph similarity join retrieves all similarity pairs from two graph collectionssuch that their similarity satisfies a given threshold. Particularly, we incorporategraph edit distance as similarity measure. Inspired by the q-gram idea for stringsimilarity queries, we introduce path-based q-grams on graphs, and establish acount filtering lower bound for generating candidates. An efficient algorithm isproposed to exploit both matching and mismatching features. We also presentlabel and degree-based matching conditions to strengthen the filtering techniques.Leveraging offline index, we investigate efficient graph similarity search. Existingsolutions employ fixed-size overlapping substructures, and thus, are susceptible to large vertex degrees and thresholds. Disparately, we present a partition-basedapproach by dividing graphs into variable-size non-overlapping partitions. The editdistance constraint is converted to a containment constraint for candidate generation.A dynamic pruning technique and an improved verification algorithm arealso developed. In addition, a cost-aware graph partitioning technique is conceivedto optimize the index.Equipping structural queries with rich semantics, substructure selection overmulti-attributed graphs finds all subgraph isomorphic mappings such that each pairof matching vertices satisfy a set of selection conditions, each against an equality,range, or set containment operator on a vertex attribute. Existing techniquesfor single-labeled graphs are developed under the assumption of identical labelmatching, and thus, cannot handle the general cases. To this end, we proposea two-tier index via judiciously materializing feature mappings. We also devisea scalable indexing and query processing method exploiting execution plans withminimum cost.
机译:最近几十年见证了图形数据(例如化学结构和业务流程)的迅速增长。在这些领域中经常发出结构化查询,因此引起了广泛关注。由于数据的不一致和噪声,最近的趋势是研究相似性查询。此外,图可能在顶点上具有多个属性,从而带来了其他挑战。因此,迫切需要用于高级结构查询的索引和处理方法。本文研究了大型图数据库中三种新兴的结构查询,包括图相似连接,图相似搜索和图子结构选择。图相似连接从两个检索所有相似对图形集合,以使其相似度满足给定阈值。特别地,我们将图形编辑距离作为相似性度量。受用于字符串相似性查询的q-gram想法的启发,我们在图上引入了基于路径的q-gram,并建立了计数过滤下限以生成候选对象。提出了一种有效的算法来利用匹配和失配特征。我们还提出了基于标签和度的匹配条件,以增强过滤技术。利用离线索引,我们研究了有效的图相似度搜索。现有解决方案采用固定大小的重叠子结构,因此易受较大顶点度和阈值的影响。截然不同,我们通过将图划分为可变大小的非重叠分区来提出基于分区的方法。将编辑距离约束转换为包含约束以生成候选对象。还开发了动态修剪技术和改进的验证算法。此外,还构想了一种成本感知图分区技术来优化索引。通过使用具有丰富语义的结构查询,在多属性图上进行子结构选择会找到所有子图同构映射,从而使每对匹配的顶点都满足一组选择条件,每个条件都针对一个顶点属性上的等式,范围或集合包含运算符。单标签图的现有技术是在相同标签匹配的假设下开发的,因此无法处理一般情况。为此,我们通过明智地实现特征映射来提出两层索引。我们还设计了一种可扩展的索引和查询处理方法,以最小的成本利用执行计划。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号