From 'Think Like a Vertex' to 'Think Like a Graph'

Abstract

To meet the challenge of processing the rapidly growing graph and network data created by modern applications, a number of distributed graph processing systems have emerged, such as Pregel and GraphLab. All these systems divide input graphs into partitions and employ a "think like a vertex" programming model to support iterative graph computation. This vertex-centric model is easy to program and has proven useful for many graph algorithms. However, it hides the partitioning information from users and thus prevents many algorithm-specific optimizations. This often results in longer execution time due to excessive network messages (e.g., in Pregel) or heavy scheduling overhead to ensure data consistency (e.g., in GraphLab). To address this limitation, we propose a new "think like a graph" programming paradigm. Under this graph-centric model, the partition structure is opened up to users and can be exploited so that communication within a partition bypasses the heavy message passing or scheduling machinery. We implemented this model in a new system, called Giraph++, based on Apache Giraph, an open source implementation of Pregel. We explore the applicability of the graph-centric model to three categories of graph algorithms, and demonstrate its flexibility and superior performance, especially on well-partitioned data. For example, on a web graph with 118 million vertices and 855 million edges, the graph-centric version of the connected component detection algorithm runs 63X faster and uses 204X fewer network messages than its vertex-centric counterpart.
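The contrast described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's implementation or any Giraph API: the function `connected_components`, the `partition_of` mapping, and the message-counting scheme are all hypothetical names introduced here. It runs min-label propagation for connected components in two modes. In the vertex-centric mode every label change travels as a message, one hop per superstep. In the graph-centric mode labels first spread inside each partition through plain in-memory updates, and only edges that cross partitions produce messages.

```python
from collections import defaultdict

def connected_components(edges, partition_of, graph_centric):
    """Min-label propagation (illustrative simulation, not a Giraph API).

    Returns (labels, supersteps, network_messages), where network_messages
    counts only messages sent across partition boundaries.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    label = {v: v for v in adj}   # every vertex starts with its own id
    active = set(adj)             # vertices that still have to notify neighbours
    supersteps = 0
    network_messages = 0

    while active:
        supersteps += 1

        if graph_centric:
            # Assumed graph-centric step: propagate labels inside each
            # partition until no local label changes, without any messages.
            work = list(active)
            while work:
                u = work.pop()
                for v in adj[u]:
                    same_part = partition_of[u] == partition_of[v]
                    if same_part and label[u] < label[v]:
                        label[v] = label[u]
                        active.add(v)
                        work.append(v)

        # Message phase: active vertices send their label to neighbours.
        inbox = defaultdict(list)
        for u in active:
            for v in adj[u]:
                remote = partition_of[u] != partition_of[v]
                if graph_centric and not remote:
                    continue      # already handled by the local propagation
                inbox[v].append(label[u])
                if remote:
                    network_messages += 1

        # Receivers keep the smallest label seen and stay active if it changed.
        active = set()
        for v, received in inbox.items():
            smallest = min(received)
            if smallest < label[v]:
                label[v] = smallest
                active.add(v)

    return label, supersteps, network_messages

# Toy input: a path 0-1-...-7 split into two partitions of four vertices each.
edges = [(i, i + 1) for i in range(7)]
partition_of = {v: 0 if v < 4 else 1 for v in range(8)}

vc = connected_components(edges, partition_of, graph_centric=False)
gc = connected_components(edges, partition_of, graph_centric=True)
print("vertex-centric:", vc[1], "supersteps,", vc[2], "network messages")
print("graph-centric: ", gc[1], "supersteps,", gc[2], "network messages")
```

On this toy path both modes compute the same labels, while the graph-centric mode needs fewer supersteps and fewer cross-partition messages. The size of the gap depends on how well the graph is partitioned, which matches the abstract's remark that the benefit is largest on well-partitioned data.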

Bibliographic details

Source
International Conference on Very Large Data Bases | 2014 | pp. 193-204 | 12 pages
Authors
Yuanyuan Tian; Andrey Balmin; Severin Andreas Corsten; Shirish Tatikonda; John McPherson
