Large-scale parallelization based on CPU and GPU cluster for cosmological fluid simulations

Meng Chen; Wang Long; Cao Zongyan; Feng Long-long; Zhu Weishan

Abstract

We present our parallel implementation for large-scale cosmological simulations of 3D supersonic fluids on CPU and GPU clusters. Our developments are based on a CPU code named WIGEON. Compared to the original sequential Fortran code, a speedup of 19 to 31 times (depending on the specific GPU card) can be achieved on a single GPU. Furthermore, our results show that the pure MPI parallelization scales very well up to 10 thousand CPU cores. In addition, a hybrid CPU/GPU parallelization scheme is introduced, and a detailed analysis of the speedup and the scaling for different numbers of CPU/GPU units is presented (up to 256 GPU cards, due to computing resource limitations). Our high scalability and speedup rely on the domain decomposition approach, optimization of the algorithm, and a series of techniques to optimize the CUDA implementation, especially the memory access pattern on the GPU. We believe this hybrid MPI + CUDA code can be an excellent candidate for 10 Peta-scale computing and beyond. © 2014 Elsevier Ltd. All rights reserved.
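The domain decomposition approach mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how WIGEON actually partitions its grid, so everything here is an assumption made for illustration: the 512³ grid, the 4 × 4 × 2 process layout, the function names, and the periodic wrap-around between neighbouring blocks. The sketch uses plain Python with no MPI calls; it only shows how a global 3D grid might be cut into contiguous blocks and how each block finds the six face neighbours it would exchange boundary data with.

```python
from itertools import product


def split_axis(n_cells, n_parts):
    """Split n_cells into n_parts contiguous ranges whose sizes differ by at most 1."""
    base, extra = divmod(n_cells, n_parts)
    ranges = []
    start = 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def decompose(grid_shape, proc_shape):
    """Map each process coordinate (px, py, pz) to its block of the global grid.

    Each block is a tuple of three (start, stop) index ranges, one per axis.
    """
    per_axis = [split_axis(n, p) for n, p in zip(grid_shape, proc_shape)]
    blocks = {}
    for coords in product(*(range(p) for p in proc_shape)):
        blocks[coords] = tuple(per_axis[axis][c] for axis, c in enumerate(coords))
    return blocks


def neighbors(coords, proc_shape):
    """Return the six face neighbours of a block.

    Periodic wrap-around is an assumption here, not something stated in the abstract.
    """
    result = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coords)
            n[axis] = (n[axis] + step) % proc_shape[axis]
            result.append(tuple(n))
    return result


# Hypothetical example: a 512^3 grid shared by 4 x 4 x 2 = 32 processes.
blocks = decompose((512, 512, 512), (4, 4, 2))
```

In a real MPI + CUDA code, each block would typically be padded with ghost cells and the neighbour list would drive the boundary exchange between ranks; those steps are omitted here.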

Bibliographic details

Source: Computers & Fluids, 2015 (issue not listed), 7 pages
Authors: Meng Chen; Wang Long; Cao Zongyan; Feng Long-long; Zhu Weishan
Original format: PDF
Language: English
Chinese Library Classification: Computer applications
Keywords: Cosmological hydrodynamics; WENO; GPU; Hierarchical memory; Heterogeneous; Large-scale
