Conference: International Conference on High Performance Computing

Accelerating inclusion-based pointer analysis on heterogeneous CPU-GPU systems

Abstract

This paper describes the first implementation of Andersen's inclusion-based pointer analysis for C programs on a heterogeneous CPU-GPU system, where both its CPU and GPU cores are used. As an important graph algorithm, Andersen's analysis is difficult to parallelise because it makes extensive modifications to the structure of the underlying graph, in a way that is highly input-dependent and statically hard to analyse. Existing parallel solutions run on either the CPU or the GPU but not both, leaving the underlying computational resources underutilised and making the ratios of CPU-only over GPU-only speedups for certain programs (i.e., graphs) unpredictable. We observe that a naive parallel solution of Andersen's analysis on a CPU-GPU system suffers from poor performance due to workload imbalance. We introduce a solution that is centered around a new dynamic workload distribution scheme. The novelty lies in prioritising the distribution of different types of workloads, i.e., graph-rewriting rules in Andersen's analysis, to the CPU or GPU according to the degree of each processing unit's suitability for processing them. This scheme is effective when combined with synchronisation-free execution of tasks (i.e., graph-rewriting rules) and difference propagation of points-to information between the CPU and GPU. For a set of seven C benchmarks evaluated, our CPU-GPU solution outperforms (on average) (1) the CPU-only solution by 50.6%, (2) the GPU-only solution by 78.5%, and (3) an oracle solution that behaves as the faster of (1) and (2) on every benchmark by 34.6%.
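For readers unfamiliar with the graph-rewriting rules and difference propagation mentioned in the abstract, the following is a minimal sequential sketch in Python. It is not the paper's CPU-GPU implementation and says nothing about its workload distribution scheme. The function name `andersen`, the tuple encoding of the four constraint kinds, and the worklist order are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def andersen(addr_of, copies, loads, stores):
    """Illustrative worklist solver for inclusion constraints (hypothetical API).

    addr_of: (p, a) pairs for  p = &a
    copies:  (p, q) pairs for  p = q
    loads:   (p, q) pairs for  p = *q
    stores:  (p, q) pairs for  *p = q
    """
    pts = defaultdict(set)         # full points-to set of each variable
    delta = defaultdict(set)       # targets added since the variable was last processed
    succ = defaultdict(set)        # copy edges: q -> {p} means pts(q) is included in pts(p)
    loads_from = defaultdict(set)  # q -> {p} for every p = *q
    stores_to = defaultdict(set)   # p -> {q} for every *p = q
    worklist = deque()

    def propagate(dst, targets):
        # Difference propagation: only genuinely new targets are recorded and queued.
        new = targets - pts[dst]
        if new:
            pts[dst] |= new
            delta[dst] |= new
            worklist.append(dst)

    def add_edge(src, dst):
        # Graph rewriting: a new copy edge receives the full set of its source once.
        if dst not in succ[src]:
            succ[src].add(dst)
            propagate(dst, pts[src])

    for p, q in loads:
        loads_from[q].add(p)
    for p, q in stores:
        stores_to[p].add(q)
    for p, a in addr_of:
        propagate(p, {a})
    for p, q in copies:
        add_edge(q, p)

    while worklist:
        v = worklist.popleft()
        d = delta.pop(v, set())
        if not d:
            continue  # stale entry: nothing new since v was last processed
        for p in loads_from[v]:    # p = *v: each new target a of v yields edge a -> p
            for a in d:
                add_edge(a, p)
        for q in stores_to[v]:     # *v = q: each new target a of v yields edge q -> a
            for a in d:
                add_edge(q, a)
        for s in succ[v]:          # existing copy edges only see the difference
            propagate(s, d)

    return {v: s for v, s in pts.items() if s}


# Example program: p = &a; q = &b; s = p; *p = q; r = *p
result = andersen(
    addr_of=[("p", "a"), ("q", "b")],
    copies=[("s", "p")],
    loads=[("r", "p")],
    stores=[("p", "q")],
)
print(result["r"])  # {'b'}
```

The load and store rules add edges to the graph while the analysis runs, which is the input-dependent structural modification that the abstract identifies as the obstacle to parallelisation. Sending only `delta` along existing edges is the sequential analogue of difference propagation. How such differences are exchanged between the CPU and GPU is not described in the abstract and is not modelled here.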
