Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers

Abstract

GPUs, with their high bandwidths and computational capabilities, are an increasingly popular target for scientific computing. Unfortunately, to date, harnessing the power of the GPU has required the use of a GPU-specific programming model such as CUDA, OpenCL, or OpenACC. As a result, to deliver portability across CPU-based and GPU-accelerated supercomputers, programmers are forced to write and maintain two versions of their applications or frameworks. In this paper, we explore the use of a compiler-based autotuning framework based on CUDA-CHiLL to deliver not only portability, but also performance portability across CPU- and GPU-accelerated platforms for the geometric multigrid linear solvers found in many scientific applications. We show that with autotuning we can attain near-Roofline performance (Roofline being a performance bound for a given computation and target architecture) across the key operations in the miniGMG benchmark for both CPU- and GPU-based architectures, as well as for multiple stencil discretizations and smoothers. We show that our technology is readily interoperable with MPI, resulting in performance at scale equal to that obtained via a hand-optimized MPI+CUDA implementation. Published by Elsevier B.V.
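The abstract describes Roofline only as a performance bound for a computation and a target architecture. As a minimal sketch of the commonly used form of that bound, attainable throughput is the smaller of the machine's peak compute rate and its peak memory bandwidth multiplied by the kernel's arithmetic intensity. The function name and all numbers below are hypothetical, chosen for illustration; they are not taken from the paper.

```python
def roofline_gflops(peak_gflops, peak_bandwidth_gbs, flops_per_byte):
    """Attainable GFLOP/s under the basic Roofline model.

    peak_gflops        -- peak compute rate of the machine (GFLOP/s)
    peak_bandwidth_gbs -- peak memory bandwidth of the machine (GB/s)
    flops_per_byte     -- arithmetic intensity of the kernel (FLOP/byte)
    """
    return min(peak_gflops, peak_bandwidth_gbs * flops_per_byte)


# Hypothetical machine and kernel values, for illustration only.
PEAK_GFLOPS = 1000.0
PEAK_BANDWIDTH_GBS = 200.0
STENCIL_INTENSITY = 0.5  # a low-intensity, bandwidth-bound kernel

bound = roofline_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, STENCIL_INTENSITY)
print(f"Roofline bound: {bound} GFLOP/s")
```

With these assumed values the kernel is limited by memory bandwidth rather than by peak compute, which is the regime in which a measured rate is usually compared against the bandwidth term of the bound.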

Bibliographic record

  • Source
    Parallel Computing | 2017, Issue 5 | pp. 50-64 | 15 pages
  • Author affiliations

    Lawrence Berkeley Natl Lab, Berkeley, CA 94721 USA;

    Lawrence Berkeley Natl Lab, Berkeley, CA 94721 USA;

    Lawrence Berkeley Natl Lab, Berkeley, CA 94721 USA;

    Lawrence Berkeley Natl Lab, Berkeley, CA 94721 USA;

    Lawrence Berkeley Natl Lab, Berkeley, CA 94721 USA;

    Univ Utah, Sch Comp, Salt Lake City, UT 84112 USA;

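  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords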

    GPU; Compiler; Autotuning; Multigrid;
