Journal: IEEE Transactions on Parallel and Distributed Systems

Cost-Aware Partitioning for Efficient Large Graph Processing in Geo-Distributed Datacenters


Abstract

Graph processing is an emerging computation model for a wide range of applications, and graph partitioning is important for optimizing the cost and performance of graph processing jobs. Recently, many graph applications store their data in geo-distributed datacenters (DCs) to provide services worldwide with low latency. This poses new challenges for existing graph partitioning methods, due to the multi-level heterogeneities in network bandwidth and communication prices across geo-distributed DCs. In this article, we propose an efficient graph partitioning method named Geo-Cut, which takes both the cost and performance objectives into consideration for large graph processing in geo-distributed DCs. Geo-Cut adopts two optimization stages. First, we propose a cost-aware streaming heuristic and utilize the one-pass streaming graph partitioning method to quickly assign edges to different DCs while minimizing inter-DC data communication cost. Second, we propose two partition refinement heuristics which identify the performance bottlenecks of geo-distributed graph processing and refine the partitioning result obtained in the first stage to reduce the inter-DC data transfer time while satisfying the budget constraint. Geo-Cut can also be applied to partition dynamic graphs thanks to its lightweight runtime overhead. We evaluate the effectiveness and efficiency of Geo-Cut using real-world graphs with both real geo-distributed DCs and simulations. Evaluation results show that, compared to state-of-the-art graph partitioning methods, Geo-Cut can reduce the inter-DC data transfer time by up to 79 percent (42 percent as the median) and reduce the monetary cost by up to 75 percent (26 percent as the median), with low overhead.
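The abstract does not spell out the streaming heuristic, so the following is only a minimal Python sketch of a generic greedy one-pass edge assignment, not Geo-Cut's actual algorithm. The function name `stream_partition`, the inputs `home` and `upload_price`, the `balance_weight` parameter, and the cost model (a new vertex replica costs the upload price of the vertex's home DC) are all assumptions made for illustration.

```python
from collections import defaultdict


def stream_partition(edges, home, upload_price, balance_weight=0.01):
    """Greedy one-pass edge assignment (illustrative, not the paper's heuristic).

    edges         -- iterable of (u, v) pairs, read once in stream order
    home          -- dict: vertex -> DC where its data originally resides (assumed)
    upload_price  -- dict: DC -> price per replica uploaded from that DC (assumed)
    """
    dcs = sorted(upload_price)
    replicas = defaultdict(set)          # vertex -> DCs already holding a copy
    for vertex, dc in home.items():
        replicas[vertex].add(dc)
    load = {dc: 0 for dc in dcs}         # edges placed in each DC so far
    assignment = {}
    total_cost = 0.0

    for u, v in edges:
        def added_cost(dc):
            # Price of creating any replica this placement would require.
            return sum(upload_price[home[x]] for x in {u, v}
                       if dc not in replicas[x])

        # Cheapest DC wins; the load term discourages piling edges on one DC.
        best = min(dcs, key=lambda dc: (added_cost(dc)
                                        + balance_weight * load[dc], dc))
        total_cost += added_cost(best)
        for x in (u, v):
            replicas[x].add(best)
        load[best] += 1
        assignment[(u, v)] = best

    return assignment, total_cost


if __name__ == "__main__":
    edges = [("a", "b"), ("b", "c"), ("c", "d")]
    home = {"a": "us", "b": "us", "c": "eu", "d": "eu"}
    upload_price = {"us": 0.02, "eu": 0.05}
    print(stream_partition(edges, home, upload_price))
```

In this toy run the edge ("b", "c") spans two DCs; the sketch places it in "eu" because replicating "b" from the cheaper "us" uplink costs less than replicating "c" from "eu". A refinement stage like the one the abstract describes would then revisit such placements with transfer time and budget in mind, which this sketch does not attempt.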
