Journal: International Journal of Parallel Programming

Heterogeneous parallel_for Template for CPU-GPU Chips


Abstract

Heterogeneous processors, comprising CPU cores and a GPU, are the de facto standard in desktop and mobile platforms. In many cases it is worthwhile to exploit both the CPU and the GPU simultaneously. However, distributing the workload is challenging when running irregular applications. In this paper, we present LogFit, a novel adaptive partitioning strategy for parallel loops, specially designed for applications with irregular data accesses running on heterogeneous CPU-GPU architectures. Our algorithm dynamically finds the optimal chunk size that must be assigned to the GPU. The number of iterations assigned to the CPU cores is also computed adaptively to avoid load imbalance. In addition, we strive to increase the programmer's productivity by providing a high-level template that eases the coding of heterogeneous parallel loops. We evaluate LogFit's performance and energy consumption using a set of irregular benchmarks running on a heterogeneous CPU-GPU processor, an Intel Haswell. Our experimental results show that we outperform Oracle-like static and other state-of-the-art dynamic approaches in both performance (up to 57%) and energy savings (up to 31%).
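
The two partitioning ideas in the abstract (searching for a good GPU chunk size, then sizing CPU chunks to keep the cores balanced against the GPU) can be illustrated with a small sketch. This is not LogFit itself: the doubling search, the balancing rule, and all names here (`pick_gpu_chunk`, `cpu_chunk_for`, `toy_throughput`) are assumptions chosen only to show the general shape of such a scheme.

```python
from typing import Callable


def pick_gpu_chunk(measure_throughput: Callable[[int], float],
                   start: int = 64, max_chunk: int = 1 << 20) -> int:
    """Hypothetical search: double the chunk while throughput keeps improving."""
    best_chunk = start
    best_tp = measure_throughput(start)
    chunk = start * 2
    while chunk <= max_chunk:
        tp = measure_throughput(chunk)
        if tp <= best_tp:
            break  # no further gain, keep the previous chunk size
        best_chunk, best_tp = chunk, tp
        chunk *= 2
    return best_chunk


def cpu_chunk_for(gpu_chunk: int, gpu_tp: float, cpu_core_tp: float,
                  remaining: int, n_cores: int) -> int:
    """Assumed balancing rule: a core's chunk takes about as long as the GPU's."""
    f = gpu_tp / cpu_core_tp          # GPU speed relative to one CPU core
    balanced = gpu_chunk / f          # iterations one core does in the same time
    tail = remaining / (f + n_cores)  # near the end, share what is left by speed
    return max(1, int(min(balanced, tail)))


def toy_throughput(chunk: int) -> float:
    """Made-up model: throughput grows with chunk size, then saturates at 1024."""
    return float(min(chunk, 1024))


gpu_chunk = pick_gpu_chunk(toy_throughput)
cpu_chunk = cpu_chunk_for(gpu_chunk, gpu_tp=8.0, cpu_core_tp=1.0,
                          remaining=100_000, n_cores=4)
```

In a real runtime the throughput values would come from timing actual chunks on each device rather than from a fixed model, and they would be re-measured as the loop progresses, since irregular workloads change cost from one chunk to the next.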
