首页> 外文期刊>Data & Knowledge Engineering >Query processing on large graphs: Approaches to scalability and response time trade offs
【24h】

Query processing on large graphs: Approaches to scalability and response time trade offs

机译:大型图上的查询处理:可伸缩性和响应时间折衷的方法

获取原文
获取原文并翻译 | 示例

摘要

Graphs, being an expressive data structure, have become increasingly important for modeling real-world applications, such as collaboration, different kinds of transactions, social networks, to name a few. With the advent of social networks and the web, the graph sizes have grown too large to fit in main memory precipitating the need for alternative approaches for an efficient, scalable evaluation of queries on graphs of any size.In this paper, we use the time-tested "divide and conquer" approach by partitioning a graph into desired number of partitions (and possibly with appropriate characteristics) and process queries over those partitions to obtain all or specified number of answers. This entails correctly computing answers that span multiple partitions or even need the same partition more than once. Given a set of partitions, there are a number of approaches using which a query can be evaluated: (i) One Partition At a Time (OPAT) approach, (ii) Traditional use of Multiple Processors (TraditionalMP), and (iii) using the Map/Reduce Multi-Processor approach (MapReduceMP) approach. The first approach, detailed in this paper, has established scalability through independent processing of partitions. The other two approaches address response time in addition to scalability. For the OPAT query evaluation approach, necessary minimal book keeping has been identified and its correctness established in this paper. Query answering on partitioned graphs also requires analyzing partitioning schemes for their impact on query processing and determining the number as well as the sequence in which partitions need to be loaded to reduce the response time for processing queries. We correlate query properties and partition characteristics to reduce query processing time in terms of the resources available.We also identify a set of quantitative metrics and use them for formulating heuristics to determine the order of loading partitions for efficient query processing. For OPAT approach, extensive experiments on large graphs (synthetic and real-world) using different partitioning schemes analyze the proposed heuristics on a variety of query types. The other two approaches are fleshed out, analyzed, and contrasted with the OPAT approach. An existing graph querying system has been extended to evaluate queries on partitioned graphs. Finally all three approaches are compared for their strengths and weaknesses.
机译:图形作为一种具有表现力的数据结构,对于建模现实应用程序(例如协作,各种交易,社交网络等)已变得越来越重要。随着社交网络和网络的出现,图的大小已经变得太大而无法容纳在主内存中,因此需要使用其他方法来对任何大小的图进行高效,可扩展的查询评估。在本文中,我们使用了时间经过测试的“分而治之”方法,将图形划分为所需数量的分区(并可能具有适当的特征),并处理这些分区上的查询以获得全部或指定数量的答案。这需要正确计算跨越多个分区甚至多次需要同一分区的答案。给定一组分区,可以使用多种方法来评估查询:(i)一次分区(OPAT)方法;(ii)传统使用多处理器(TraditionalMP);以及(iii)使用Map / Reduce多处理器方法(MapReduceMP)方法。本文详细介绍的第一种方法已通过独立处理分区建立了可伸缩性。除可伸缩性外,其他两种方法还解决了响应时间。对于OPAT查询评估方法,已确定了必要的最小簿记并确定了其正确性。对分区图的查询答复还需要分析分区方案对查询处理的影响,并确定需要加载分区的数量和顺序,以减少处理查询的响应时间。我们将查询属性和分区特性关联起来,以减少可用资源方面的查询处理时间。我们还确定了一组定量指标,并使用它们来制定启发式方法以确定加载分区的顺序以进行有效的查询处理。对于OPAT方法,使用不同的分区方案在大型图(合成图和实际图)上进行的大量实验分析了各种查询类型上的启发式算法。对其他两种方法进行了充实,分析,并与OPAT方法进行了对比。现有的图查询系统已扩展为评估分区图上的查询。最后,比较了这三种方法的优缺点。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号