International Conference for High Performance Computing, Networking, Storage and Analysis

Data partitioning strategies for graph workloads on heterogeneous clusters

Abstract

Large-scale graph analytics are an important class of problems in the modern data center. However, while data centers are trending towards large numbers of heterogeneous processing nodes, graph analytics frameworks still operate under the assumption of uniform compute resources. In this paper, we develop heterogeneity-aware data ingress strategies for graph analytics workloads using the popular PowerGraph framework. We illustrate how simple estimates of relative node computational throughput can guide heterogeneity-aware data partitioning algorithms to provide balanced graph cutting decisions. Our work enhances five online data ingress strategies from a variety of sources to optimize application execution for throughput differences in heterogeneous data centers. The proposed partitioning algorithms improve the runtime of several popular machine learning and data mining applications by as much as 65%, and by 32% on average, compared to the default balanced partitioning approaches.
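
The abstract does not spell out the five ingress strategies, so the following is only a minimal sketch of the general idea: using relative throughput estimates to skew an otherwise uniform hash-based edge placement. It is not the paper's algorithm and does not use the PowerGraph API. The function name `weighted_edge_partition`, the edge representation, and the throughput values are all hypothetical choices made for illustration.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def weighted_edge_partition(edges, throughputs):
    """Map each edge to a node index, with shares proportional to throughput.

    Illustrative assumption: `throughputs` holds one positive relative
    throughput estimate per node (e.g. [1.0, 1.0, 2.0]).
    """
    total = sum(throughputs)
    # Cumulative share boundaries in (0, 1], e.g. [1, 1, 2] -> [0.25, 0.5, 1.0].
    bounds = [c / total for c in accumulate(throughputs)]
    assignment = {}
    for src, dst in edges:
        # Hash the edge so placement is deterministic across runs and machines.
        digest = hashlib.sha256(f"{src}-{dst}".encode()).digest()
        # Turn the first 8 bytes into a point in the unit interval.
        point = int.from_bytes(digest[:8], "big") / 2**64
        # Pick the node whose share interval contains the point; the clamp
        # guards against the point rounding up to exactly 1.0.
        node = min(bisect_right(bounds, point), len(throughputs) - 1)
        assignment[(src, dst)] = node
    return assignment


if __name__ == "__main__":
    demo_edges = [(i, i + 1) for i in range(10)]
    for edge, node in weighted_edge_partition(demo_edges, [1.0, 1.0, 2.0]).items():
        print(edge, "->", node)
```

With equal throughput values this reduces to ordinary uniform hash placement, which corresponds to the homogeneous assumption the abstract argues against. A node with twice the estimated throughput receives roughly twice as many edges.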