Venue: IEEE International Conference on Data Engineering

Pagrol: Parallel Graph OLAP over Large-Scale Attributed Graphs


Abstract

Attributed graphs are becoming important tools for modeling information networks, such as the Web and various social networks (e.g., Facebook, LinkedIn, Twitter). However, it is computationally challenging to manage and analyze attributed graphs to support effective decision making. In this paper, we propose Pagrol, a parallel graph OLAP (Online Analytical Processing) system over attributed graphs. In particular, Pagrol introduces a new conceptual Hyper Graph Cube model (an attributed-graph analogue of the data cube model for relational DBMS) to aggregate attributed graphs at different granularities and levels. The proposed model supports different queries as well as a new set of graph OLAP Roll-Up/Drill-Down operations. Furthermore, on the basis of Hyper Graph Cube, Pagrol provides an efficient MapReduce-based parallel graph cubing algorithm, MRGraph-Cubing, to compute the graph cube for an attributed graph. Pagrol employs numerous optimization techniques: (a) a self-contained join strategy to minimize I/O cost; (b) a scheme that groups cuboids into batches so as to minimize redundant computations; (c) a cost-based scheme to allocate the batches into bags (each with a small number of batches); and (d) an efficient scheme to process a bag using a single MapReduce job. Results of extensive experimental studies using both real Facebook and synthetic datasets on a 128-node cluster show that Pagrol is effective, efficient, and scalable.
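To make the idea of aggregating an attributed graph at different granularities concrete, the following is a minimal single-machine sketch. It is not Pagrol's Hyper Graph Cube model or the MRGraph-Cubing algorithm, whose details the abstract does not give. It assumes a toy graph with two hypothetical vertex attributes (`gender`, `city`), treats each subset of those attributes as one cuboid, and aggregates by counting the vertices in each group and the edges between groups. The names `aggregate` and `all_cuboids` are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy attributed graph: vertex id -> attribute values.
vertices = {
    1: {"gender": "F", "city": "NY"},
    2: {"gender": "M", "city": "NY"},
    3: {"gender": "F", "city": "SF"},
    4: {"gender": "F", "city": "NY"},
}
# Undirected edges between vertex ids.
edges = [(1, 2), (1, 3), (2, 3), (1, 4)]


def aggregate(vertices, edges, dims):
    """Merge vertices that agree on `dims`; count group sizes and inter-group edges."""
    def key(v):
        return tuple(vertices[v][d] for d in dims)

    node_counts = Counter(key(v) for v in vertices)
    # Sort the two endpoint keys so (a, b) and (b, a) count as the same edge.
    edge_counts = Counter(tuple(sorted((key(u), key(v)))) for u, v in edges)
    return node_counts, edge_counts


def all_cuboids(dim_names):
    """Enumerate every subset of dimensions, from the apex () to the full set."""
    for r in range(len(dim_names) + 1):
        yield from combinations(dim_names, r)


# One aggregate graph per cuboid; dropping a dimension corresponds to a roll-up.
cube = {dims: aggregate(vertices, edges, dims)
        for dims in all_cuboids(["gender", "city"])}
```

In this sketch every cuboid is computed independently from the base graph, which repeats work across cuboids. The batching and bag-allocation optimizations listed in the abstract target exactly that kind of redundancy in a distributed setting.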
