IEEE High Performance Extreme Computing Conference

GPU-Accelerated Discontinuous Galerkin Methods: 30x Speedup on 345 Billion Unknowns

Abstract

A discontinuous Galerkin method for discretizing the compressible Euler equations, the governing equations of inviscid fluid dynamics, on Cartesian meshes is developed for Graphics Processing Units (GPUs) via OCCA, a unified approach to performance portability on multi-threaded hardware architectures. A 30x time-to-solution speedup over CPU-only implementations is demonstrated on up to 1,536 NVIDIA V100 GPUs using non-CUDA-Aware MPI communications, and parallel strong scalability is shown on up to 6,144 NVIDIA V100 GPUs for a problem containing 345 billion unknowns. A comparison of CUDA-Aware MPI communication with non-GPUDirect communication demonstrates an additional 24% speedup on eight nodes comprising 32 NVIDIA V100 GPUs.
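To put the abstract's figures in perspective, the following is a minimal Python sketch of the arithmetic they imply. The function names are hypothetical and not taken from the paper. It also assumes that "345 billion" is a rounded count, and that "an additional 24% speedup" means the CUDA-Aware run is 1.24 times faster than the non-GPUDirect run; the abstract does not spell out either point.

```python
def unknowns_per_gpu(total_unknowns: float, num_gpus: int) -> float:
    """Average number of unknowns each GPU holds in a strong-scaling run."""
    return total_unknowns / num_gpus


def gpus_per_node(num_gpus: int, num_nodes: int) -> int:
    """GPUs hosted on each node, assuming an even distribution."""
    return num_gpus // num_nodes


def runtime_fraction(speedup_percent: float) -> float:
    """Fraction of the baseline runtime left after a given percentage speedup.

    Assumption: a p% speedup means the new run is (1 + p/100) times faster.
    """
    return 1.0 / (1.0 + speedup_percent / 100.0)


if __name__ == "__main__":
    # Figures quoted in the abstract; 345e9 is treated as approximate.
    print(unknowns_per_gpu(345e9, 6144))   # load per GPU at the largest scale
    print(gpus_per_node(32, 8))            # node layout in the MPI comparison
    print(runtime_fraction(24.0))          # relative runtime with CUDA-Aware MPI
```

Under these assumptions, the largest run places roughly 56 million unknowns on each GPU, the MPI comparison uses four GPUs per node, and the CUDA-Aware run takes about 81% of the non-GPUDirect runtime.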
