Journal of Parallel and Distributed Computing

Algorithm level power efficiency optimization for CPU-GPU processing element in data intensive SIMD/SPMD computing



Abstract

Power efficiency must be investigated at every level of a High Performance Computing (HPC) system because of the increasing computation demands of scientific and engineering applications. Focusing on the critical design constraints at the software level, where programs run on a parallel system composed of huge numbers of power-hungry components, we optimize HPC program design to achieve the best possible power performance on the target hardware platform. The power performance of a CUDA Processing Element (PE) is determined by both hardware factors, including the power features of each component (CPU, GPU, main memory and PCI buses) and their interconnection architecture, and software factors, including algorithm design and the character of the executable instructions performed on it. In this paper, approaches to model and evaluate the power consumption of large scale SIMD computation by CUDA PEs on multi-core and GPU platforms are introduced. The model yields design characteristic values at the early programming stage, giving programmers the environment information they need to choose the most power-efficient alternative. Based on the model, CPU dynamic frequency scaling (DFS) can be applied to the CUDA PE architecture, adjusting the CPU frequency to enhance the power efficiency of the entire PE without compromising its computing performance. The power model and the power efficiency improvements of the new designs have been validated by measuring the new programs on a real GPU multiprocessing system.
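
The abstract does not give the model itself, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that PE power is the sum of the four component draws named above, that CPU power under DFS grows linearly with frequency at fixed voltage, and that CPU-side work overlaps with the GPU kernel. All names and numbers (`PEPower`, `cpu_power`, `pick_cpu_frequency`, the wattages) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PEPower:
    """Hypothetical average power draw of each PE component, in watts."""
    cpu: float
    gpu: float
    memory: float
    pci: float

    def total(self) -> float:
        # Assumption: component draws simply add up to the PE draw.
        return self.cpu + self.gpu + self.memory + self.pci


def cpu_power(freq_ghz: float, static_w: float = 20.0,
              dyn_w_per_ghz: float = 25.0) -> float:
    """Toy CPU power curve for frequency-only scaling (assumed linear)."""
    return static_w + dyn_w_per_ghz * freq_ghz


def pick_cpu_frequency(freqs_ghz, cpu_work_gcycles: float,
                       gpu_time_s: float) -> float:
    """Lowest frequency at which the CPU work still fits inside the GPU
    kernel time, so the run time of the whole PE is unchanged."""
    for f in sorted(freqs_ghz):
        if cpu_work_gcycles / f <= gpu_time_s:
            return f
    # No frequency hides the CPU work; fall back to the fastest one.
    return max(freqs_ghz)


def pe_energy(power: PEPower, time_s: float) -> float:
    """Energy in joules for a run of the given duration."""
    return power.total() * time_s


if __name__ == "__main__":
    freqs = [1.2, 1.6, 2.0, 2.4]      # hypothetical available CPU steps (GHz)
    gpu_time = 2.0                    # hypothetical GPU kernel time (s)
    chosen = pick_cpu_frequency(freqs, cpu_work_gcycles=3.0,
                                gpu_time_s=gpu_time)
    for f in (max(freqs), chosen):
        pe = PEPower(cpu=cpu_power(f), gpu=150.0, memory=15.0, pci=5.0)
        print(f"{f} GHz: {pe_energy(pe, gpu_time):.1f} J")
```

Under these assumptions the run time is set by the GPU kernel, so lowering the CPU frequency to the slowest step that still hides the CPU work reduces total energy without lengthening the run. Real platforms would need measured power curves in place of the toy constants.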
