Conference: International Conference for High Performance Computing, Networking, Storage and Analysis

Data partitioning strategies for graph workloads on heterogeneous clusters

Abstract

Large-scale graph analytics is an important class of problems in the modern data center. However, while data centers are trending toward large numbers of heterogeneous processing nodes, graph analytics frameworks still assume uniform compute resources. In this paper, we develop heterogeneity-aware data ingress strategies for graph analytics workloads using the popular PowerGraph framework. We illustrate how simple estimates of relative node computational throughput can guide heterogeneity-aware data partitioning algorithms to provide balanced graph cutting decisions. Our work enhances five online data ingress strategies from a variety of sources to optimize application execution for throughput differences in heterogeneous data centers. The proposed partitioning algorithms improve the runtime of several popular machine learning and data mining applications by as much as 65%, and by 32% on average, compared with the default balanced partitioning approaches.
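
The idea of letting relative throughput estimates guide partitioning can be illustrated with a minimal sketch. This is not one of the five strategies from the paper and does not use the PowerGraph API. It is a plain throughput-weighted random edge placement, and the names `assign_edges`, `load_shares`, and the `throughput` list are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter


def assign_edges(edges, throughput, seed=0):
    """Place each edge on a node with probability proportional to that
    node's estimated relative throughput (an illustrative assumption,
    not the paper's algorithm).

    edges      -- list of (src, dst) tuples
    throughput -- list of positive relative throughput estimates,
                  one entry per node
    Returns a list of node indices, one per edge, in input order.
    """
    if not throughput or any(t <= 0 for t in throughput):
        raise ValueError("throughput must be a non-empty list of positive numbers")
    rng = random.Random(seed)  # seeded so the placement is reproducible
    nodes = range(len(throughput))
    # random.choices draws with replacement using the given weights.
    return rng.choices(nodes, weights=throughput, k=len(edges))


def load_shares(placement, num_nodes):
    """Return the fraction of edges placed on each node."""
    counts = Counter(placement)
    total = len(placement)
    return [counts[n] / total if total else 0.0 for n in range(num_nodes)]


if __name__ == "__main__":
    # A toy chain graph and two nodes, the second assumed 3x faster.
    toy_edges = [(i, i + 1) for i in range(10_000)]
    placement = assign_edges(toy_edges, throughput=[1.0, 3.0], seed=42)
    print(load_shares(placement, num_nodes=2))
```

In this sketch the faster node receives roughly three quarters of the edges, so per-node work is proportional to capacity rather than uniform. A real ingress strategy would also account for vertex replication and communication cost, which this example ignores.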