Journal: Experimental Mechanics

Multidisciplinary simulation acceleration using multiple shared memory graphical processing units


Abstract

In this article, we describe the strategies and programming techniques used in porting a multidisciplinary fluid/thermal interaction procedure to graphical processing units (GPUs). We discuss the strategies for selecting which disciplines or routines are chosen for use on GPUs rather than CPUs. In addition, we describe the programming techniques, including the use of the Compute Unified Device Architecture (CUDA), mixed-language (Fortran/C/CUDA) programming, Fortran/C memory mapping of arrays, and GPU optimization. We solve all equations using the multi-block, structured-grid, finite volume numerical technique, with the dual time-step scheme used for unsteady simulations. Our numerical solver code targets CUDA-capable GPUs produced by NVIDIA. We use NVIDIA Tesla C2050/C2070 GPUs based on the Fermi architecture and compare the resulting performance against Intel Xeon X5690 CPUs. Individual solver routines converted to CUDA typically run about 10 times faster on a GPU for sufficiently dense computational grids. We used a conjugate cylinder geometry and ran a turbulent steady-flow simulation on four increasingly dense computational grids. Our densest computational grid is divided into 13 blocks each containing 1033×1033 grid points, for a total of 13.87 million grid points, or 1.07 million grid points per domain block. Comparing the performance of eight GPUs to that of eight CPUs, we obtain an overall speedup of about 6.0 when using our densest computational grid. This amounts to an 8-GPU simulation running about 39.5 times faster than a single-CPU simulation.
