Source: 2nd International Conference on Computational Methods for Thermal Problems

HIGH PERFORMANCE COMPUTATION OF INCOMPRESSIBLE FLOW BY LATTICE BOLTZMANN METHOD ON MULTI-NODE GPU CLUSTER



Abstract

GPGPU has drawn much attention for accelerating non-graphics applications. A simulation using the D3Q19 model of the lattice Boltzmann method was executed successfully on a multi-node GPU cluster using CUDA programming and the MPI library. The parallel results obtained with 1D, 2D and 3D domain decompositions were compared and analyzed. With a 384 x 384 x 384 mesh and 96 GPUs, the performance of the 3D decomposition is about 3 to 4 times higher than that of the 1D decomposition. To hide the communication time, we introduced a technique that overlaps computation with communication. Using 8 to 96 GPUs, performance increases by a factor of about 1.1 to 1.3 in overlapping mode. A large-scale computation of the flow around a sphere at Re = 13000 was carried out successfully using a 2000 x 1000 x 1000 mesh and 100 GPUs. Processing 100,000 time steps took 6.0 hours. Under this condition, the computation time (2.79 hours) and the data communication time (3.06 hours) are almost the same.
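
To give a rough feel for why the choice of domain decomposition matters, the sketch below counts the boundary cells that one interior subdomain would exchange per time step on the 384 x 384 x 384 mesh with 96 subdomains. The process grids (96 x 1 x 1, 12 x 8 x 1, 6 x 4 x 4) and the helper name `halo_cells` are assumptions made for illustration; the abstract does not state the grids actually used, and the count covers face exchange only, so it says nothing about measured performance.

```python
# Boundary (halo) cells exchanged per step by one interior subdomain.
# The process grids below are hypothetical; the abstract does not give them.

def halo_cells(mesh, grid):
    """Return the number of face cells one interior subdomain exchanges.

    mesh: global mesh size (nx, ny, nz)
    grid: number of subdomains along each axis (px, py, pz)
    Only faces are counted; edge exchange is ignored for simplicity.
    """
    lx, ly, lz = (n // p for n, p in zip(mesh, grid))
    # Area of the face perpendicular to each axis of the local block.
    face_area = (ly * lz, lx * lz, lx * ly)
    total = 0
    for axis, parts in enumerate(grid):
        if parts > 1:  # an axis that is not split has no neighbours
            total += 2 * face_area[axis]
    return total


MESH = (384, 384, 384)
GRIDS = {
    "1D": (96, 1, 1),  # assumed slab decomposition
    "2D": (12, 8, 1),  # assumed pencil decomposition
    "3D": (6, 4, 4),   # assumed block decomposition
}

if __name__ == "__main__":
    for name, grid in GRIDS.items():
        print(name, grid, halo_cells(MESH, grid))
```

Under these assumed grids the 1D slabs are only 4 cells thick, so each subdomain exchanges far more boundary data relative to its interior than a 3D block does, which is consistent in direction with the advantage the abstract reports for 3D decomposition.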
