Published in: Computers & Geosciences

GPU acceleration of MPAS microphysics WSM6 using OpenACC directives: Performance and verification


Abstract

In this study, we used OpenACC directives to accelerate a microphysics scheme embedded within the Model for Prediction Across Scales (MPAS). We focused on the Weather Research and Forecasting (WRF) single-moment 6-class microphysics scheme (WSM6), one of the most time-consuming physics parameterization schemes, and parallelized it on a graphics processing unit (GPU). To optimize the performance of the WSM6 computation on the GPU, we applied several essential methodologies that minimize data transfer between the central processing unit (CPU) and the GPU and reduce the waste of GPU threads during computation. As a result, GPU runs using one Tesla V100 were, on average, 2.38 times faster than runs using 48 message passing interface (MPI) processes. When the whole model was ported onto the GPU, we achieved a 5.71x speed-up in WSM6 computation, excluding I/O communication. In addition, a precise verification method distinguished nonlinear chaotic error growth from differences introduced by GPU computation, taking into account the characteristics of the major output variables from WSM6. We then compared the difference between the CPU and GPU runs with the difference between CPU runs built with different compilers. Moreover, we examined bias in these differences, because a systematic bias can distort the climatology of a model simulation. Our approach passed the verification process, which represents a successful application of GPU acceleration to realistic full-model integration of MPAS.
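The abstract does not spell out the verification procedure, so the following is only a minimal sketch of the general idea it describes: treat the spread between two CPU builds (different compilers) as an acceptable baseline, require the CPU-versus-GPU difference to stay within that baseline, and separately check that the difference has no systematic bias. The function names, the use of root-mean-square difference as the metric, and the `bias_tol` threshold are all assumptions made for illustration, not the authors' method.

```python
import math


def rms_diff(a, b):
    """Root-mean-square difference between two equally sized fields."""
    if len(a) != len(b):
        raise ValueError("fields must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def mean_bias(a, b):
    """Mean signed difference (a - b); nonzero values indicate systematic bias."""
    if len(a) != len(b):
        raise ValueError("fields must have the same length")
    return sum(x - y for x, y in zip(a, b)) / len(a)


def passes_verification(cpu_ref, cpu_alt, gpu, bias_tol):
    """Hypothetical acceptance test (an assumption, not the paper's criterion).

    cpu_ref  : output field from the reference CPU build
    cpu_alt  : same field from a CPU build using a different compiler
    gpu      : same field from the GPU run
    bias_tol : largest acceptable absolute mean bias (illustrative threshold)
    """
    baseline = rms_diff(cpu_alt, cpu_ref)   # compiler-to-compiler spread
    candidate = rms_diff(gpu, cpu_ref)      # CPU-to-GPU difference
    unbiased = abs(mean_bias(gpu, cpu_ref)) <= bias_tol
    return candidate <= baseline and unbiased


if __name__ == "__main__":
    # Synthetic values for illustration only; these are not model output.
    cpu_ref = [1.0, 2.0, 3.0, 4.0]
    cpu_alt = [1.01, 1.99, 3.01, 3.99]
    gpu = [1.005, 1.995, 3.005, 3.995]
    print(passes_verification(cpu_ref, cpu_alt, gpu, bias_tol=1e-3))
```

In practice such a check would be applied per output variable and per time step, since chaotic error growth makes later differences larger regardless of the hardware used.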
