Journal: Concurrency and Computation: Practice and Experience

Parallel resolution of the 3D Helmholtz equation based on multi-graphics processing unit clusters


Abstract

The resolution of the 3D Helmholtz equation is required in the development of models related to a wide range of scientific and technological applications. For solving this equation in complex arithmetic, the biconjugate gradient (BCG) method is one of the most relevant solvers. However, this iterative method has a high computational cost because of the large sparse matrix and the vector operations involved. In this paper, a specific BCG method, adapted to the regularities of the Helmholtz equation, is presented. This BCG is based on the implementation of a novel format (named 'Regular Format') that allows the large sparse matrix involved in the sparse matrix-vector product to be stored in a compact form. The contribution of this work is twofold: (1) decreasing the memory requirements of the 3D Helmholtz equation using the 'Regular Format' and (2) speeding up the resolution of the equation using high-performance computing resources. A hybrid Message Passing Interface (MPI) and CUDA graphics processing unit (GPU) parallelization, named Fast-Helmholtz, has been developed that is capable of solving complex problems in a short time. Fast-Helmholtz combines optimizations at the MPI and GPU levels to reduce communication costs and to improve the exploitation of the GPU architecture. This strategy makes it possible to extend the dimension of the Helmholtz problem to be solved, thanks to the significant reduction of memory requirements and runtime.
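The abstract does not spell out the layout of the 'Regular Format', so the following is only a minimal sketch of the general idea it points to, under stated assumptions: a 7-point finite-difference discretization on a uniform grid with zero Dirichlet boundaries, where every off-diagonal entry is the same constant and only the diagonal varies with the local wavenumber. Under that assumption the sparse matrix-vector product needs one complex value per grid point plus one scalar, and an unpreconditioned BCG iteration in complex arithmetic can run on top of it. All names here (`helmholtz_apply`, `bicg`, `solve_demo`, the damping factor `eta`, and the wavenumber profile) are hypothetical and chosen for illustration; this is serial Python, not the authors' MPI/CUDA implementation.

```python
import math

def helmholtz_apply(v, diag, off, shape):
    """Matrix-free y = A v for a 7-point stencil on an nx*ny*nz grid.

    Assumed layout: `diag` holds one complex value per grid point and `off`
    is the single constant off-diagonal value (zero Dirichlet boundaries).
    """
    nx, ny, nz = shape
    y = [0j] * len(v)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                c = (i * ny + j) * nz + k
                acc = diag[c] * v[c]
                if i > 0:
                    acc += off * v[c - ny * nz]
                if i < nx - 1:
                    acc += off * v[c + ny * nz]
                if j > 0:
                    acc += off * v[c - nz]
                if j < ny - 1:
                    acc += off * v[c + nz]
                if k > 0:
                    acc += off * v[c - 1]
                if k < nz - 1:
                    acc += off * v[c + 1]
                y[c] = acc
    return y

def cdot(a, b):
    """Complex inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def bicg(apply_a, apply_ah, b, tol=1e-10, max_iter=500):
    """Unpreconditioned biconjugate gradient for A x = b, starting from x = 0."""
    x = [0j] * len(b)
    r, rt = list(b), list(b)        # residual and shadow residual
    p, pt = list(r), list(rt)       # search directions
    rho = cdot(rt, r)
    b_norm = math.sqrt(cdot(b, b).real)
    for it in range(1, max_iter + 1):
        q = apply_a(p)              # A p
        qt = apply_ah(pt)           # A^H pt
        alpha = rho / cdot(pt, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rt = [ri - alpha.conjugate() * qi for ri, qi in zip(rt, qt)]
        if math.sqrt(cdot(r, r).real) <= tol * b_norm:
            return x, it
        rho_new = cdot(rt, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        pt = [ri + beta.conjugate() * pi for ri, pi in zip(rt, pt)]
        rho = rho_new
    return x, max_iter

def solve_demo(shape=(4, 4, 4), eta=0.2):
    """Solve a small damped Helmholtz problem with a point source (illustrative values)."""
    nx, ny, nz = shape
    n = nx * ny * nz
    h = 1.0 / (nx + 1)              # assumed uniform spacing in all directions
    off = -1.0 / h ** 2             # constant off-diagonal of -Laplacian
    diag = []
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                k2 = 30.0 + 5.0 * i / (nx - 1)   # hypothetical squared wavenumber
                diag.append(6.0 / h ** 2 - k2 * (1 + eta * 1j))
    diag_h = [d.conjugate() for d in diag]       # A^H: off is real, diag conjugated
    b = [0j] * n
    b[n // 2] = 1 + 0j
    apply_a = lambda v: helmholtz_apply(v, diag, off, shape)
    apply_ah = lambda v: helmholtz_apply(v, diag_h, off, shape)
    x, iters = bicg(apply_a, apply_ah, b)
    res = [bi - yi for bi, yi in zip(b, apply_a(x))]
    rel = math.sqrt(cdot(res, res).real) / math.sqrt(cdot(b, b).real)
    return x, iters, rel

if __name__ == "__main__":
    _, iters, rel = solve_demo()
    print("iterations:", iters, "relative residual:", rel)
```

In this sketch the operator occupies n complex values and one real scalar for n grid points, whereas a general-purpose sparse format would hold up to seven nonzeros per row along with their column indices. Because the matrix-vector products dominate each BCG iteration, they are also the natural part to offload to a GPU and to partition across MPI ranks, which is the direction the abstract describes.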
