Unleashing the performance potential of CPU-GPU platforms for the 3D atmospheric Euler solver

Abstract

As a traditional application on various supercomputers, atmospheric modeling has long suffered from low performance efficiency. In this paper, we pick the 3D Euler equation solver (the most essential dynamic component of a non-hydrostatic atmospheric model) as the target application and explore the maximum performance efficiency that can be achieved on CPU-GPU hybrid architectures. Besides presenting a suitable hybrid domain decomposition methodology and applying appropriate tuning techniques to both the CPU and GPU parts, we further propose a novel GPU tuning technique, namely a customizable data caching mechanism with a thread warp rescheduling scheme, which is specifically designed for the Euler solver. Combining all the optimizations, we achieve a remarkable performance boost on mainstream GPU architectures including the Tesla Fermi C2050, K20X, K40 and K80. In particular, on the latest Tesla K80, we demonstrate a 31.64× speedup over a 12-core E5-2697 CPU. In addition, on a hybrid CPU-GPU node with two 12-core E5-2697 CPUs and two Tesla K80 GPUs, a sustained double-precision performance of 1.04 Tflops (16% of the peak) is achieved, which is remarkably higher than the efficiency of similar optimization efforts on heterogeneous platforms (strictly less than 10%, as demonstrated in the related work). Finally, a nearly linear weak scaling efficiency is achieved, which demonstrates the effectiveness of our domain decomposition method.

Bibliographic details

Source
International Conference on Application-specific Systems, Architectures and Processors, 2016, pp. 41-49 (9 pages)
Authors
Haohuan Fu; Jingheng Xu; Lin Gan; Chao Yang; Wei Xue; Wenlai Zhao; Wen Shi; Xinliang Wang; Guangwen Yang
Original format: PDF
Keywords
Atmospheric modeling; Graphics processing units; Mathematical model; Computational modeling; Computer architecture; Tuning; Three-dimensional displays
