Conference: IEEE Conference on Service-Oriented Computing and Applications

Distributed PathGraph: A Cluster Centric Framework for Distributed Processing Graph

Abstract

Large-scale graph processing is challenging because of the characteristics of graph structure. A distributed graph processing framework is generally a better choice for large graphs with billions of edges. However, traditional iterative computation models such as BSP underperform in a distributed environment because of large communication overheads and slow iterative convergence. Here, we present DistPathGraph, a distributed graph processing framework based on PathGraph. First, considering the differences between a single machine and a cluster, we describe a novel cluster-based partitioning method that differs from the one used in PathGraph. Second, because vertices depend on one another and data replicas in different partitions must stay consistent, we present a scheme to control the order in which vertices are updated. Finally, we design a message-packing strategy that reduces communication congestion and improves the rate of iterative convergence. A synchronous communication model generally controls the computation and communication steps with a barrier, which induces more CPU idle time and communication congestion. Although an asynchronous model can eliminate some CPU idle time, it may lead to inconsistency among vertices and requires complex control logic. Compared with these two models, our strategy is a better compromise. We evaluate our graph processing framework against GraphLab. The experimental results validate that our partitioning method and communication strategy improve performance, and that our framework outperforms GraphLab by up to 6.53X.
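The abstract does not describe how the message-packing strategy works internally. As a rough illustration of the general idea only, the sketch below buffers vertex updates per destination partition and sends them as one batch once a threshold is reached, without waiting for a global barrier. The class name `MessagePacker`, the `send` callback, and the `max_batch` threshold are all hypothetical and are not taken from DistPathGraph.

```python
from collections import defaultdict


class MessagePacker:
    """Buffers per-partition vertex updates and sends them in batches.

    Hypothetical illustration; not the DistPathGraph implementation.
    """

    def __init__(self, send, max_batch=3):
        self.send = send            # assumed callback: send(partition_id, updates)
        self.max_batch = max_batch  # assumed flush threshold (distinct vertices)
        self.buffers = defaultdict(dict)

    def add(self, partition_id, vertex_id, value):
        # Keep only the newest value per vertex so stale updates are never sent.
        buf = self.buffers[partition_id]
        buf[vertex_id] = value
        if len(buf) >= self.max_batch:
            self.flush(partition_id)

    def flush(self, partition_id):
        # Remove the buffer and send its contents as a single packed message.
        buf = self.buffers.pop(partition_id, None)
        if buf:
            self.send(partition_id, sorted(buf.items()))

    def flush_all(self):
        for partition_id in list(self.buffers):
            self.flush(partition_id)


# Usage: five updates are packed into two messages.
sent = []
packer = MessagePacker(lambda pid, batch: sent.append((pid, batch)), max_batch=3)
updates = [(1, 10, 0.5), (1, 11, 0.2), (2, 20, 0.9), (1, 10, 0.7), (1, 12, 0.1)]
for pid, vid, val in updates:
    packer.add(pid, vid, val)
packer.flush_all()
print(sent)
```

In this sketch, batching trades a little latency for fewer, larger messages, and coalescing repeated updates to the same vertex means only the latest value travels over the network. Whether DistPathGraph makes the same trade-offs is not stated in the abstract.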
