
Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures


Abstract

This paper presents work on porting FWM2DA, an open-source program, to multi-core CPU and GPU architectures. FWM2DA is a sequential Fortran90 program that performs acoustic wave propagation from a single source location through a 2D subsurface earth model using finite-difference time-domain modelling. We have reimplemented the program in C and extended it to handle multiple source locations and to allow different grid spacing in the x and z directions of the subsurface earth model. The performance of the upgraded version is improved through inter-node and intra-node parallelization. Inter-node parallelization is implemented with MPI to make efficient use of distributed memory, while intra-node parallelization focuses on efficient use of the underlying architecture's resources. The outputs of these programs were compared with that of the original FWM2DA program for the Sigsbee2a model and found to be similar, which establishes the correctness of the implementation. The developed programs were tested with a two-layer subsurface earth model of 5000 × 5000 grid points on the PARAM Shreshta system, which has Intel Xeon Gold 6148F Skylake CPUs and NVIDIA Tesla V100 GPUs. The ported programming models were evaluated on execution time for wavefield propagation from a single source location using the two-layer model. Speedups of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for the CUDA C implementation were recorded over the sequential C program. The multi-core CPU program was further optimized, reaching an overall speedup of 29.37X over the sequential C program. The CUDA C program was also improved by using GPU shared memory, giving a 1.18X speedup over the baseline CUDA C program. The numerical experiments demonstrate the effectiveness and robustness of the developed programs, with high scalability and efficiency on a multi-core CPU and GPU based HPC system.
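To make the kind of kernel described in the abstract concrete, the following is a minimal sketch in C of one finite-difference time step for the 2D constant-density acoustic wave equation, with separate grid spacings `dx` and `dz` and an OpenMP directive on the outer loop. It is not the FWM2DA code: the function name `fd2d_step`, the row-major array layout, the second-order stencil and the untouched boundary points are all assumptions made for illustration, and the paper's actual stencil order and boundary treatment are not stated in the abstract.

```c
#include <assert.h>

/* Hypothetical sketch, not the FWM2DA implementation.
 * Advances the pressure field by one time step using
 *   p_next = 2*p_cur - p_prev + (v*dt)^2 * (d2p/dx2 + d2p/dz2)
 * with second-order central differences in space.
 * Arrays are row-major: index = iz * nx + ix.
 * Boundary points of p_next are left unmodified (assumption). */
void fd2d_step(int nx, int nz, double dx, double dz, double dt,
               const double *vel, const double *p_prev,
               const double *p_cur, double *p_next)
{
    assert(nx >= 3 && nz >= 3);

    const double inv_dx2 = 1.0 / (dx * dx);
    const double inv_dz2 = 1.0 / (dz * dz);

    /* Rows are independent within a time step, so the outer loop can be
     * shared among threads. The pragma is ignored if OpenMP is disabled. */
    #pragma omp parallel for
    for (int iz = 1; iz < nz - 1; ++iz) {
        for (int ix = 1; ix < nx - 1; ++ix) {
            const int i = iz * nx + ix;

            /* Second derivatives along x and z with their own spacings. */
            const double d2x =
                (p_cur[i - 1] - 2.0 * p_cur[i] + p_cur[i + 1]) * inv_dx2;
            const double d2z =
                (p_cur[i - nx] - 2.0 * p_cur[i] + p_cur[i + nx]) * inv_dz2;

            const double c = vel[i] * dt;
            p_next[i] = 2.0 * p_cur[i] - p_prev[i] + c * c * (d2x + d2z);
        }
    }
}
```

In a full modelling run this step would be called once per time sample, rotating the three wavefield buffers and injecting the source term at the source grid point. Under the same assumptions, independent source locations could be distributed across MPI ranks, with each rank running its own time loop.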

Bibliographic details

  • Source
    Computers & Geosciences | 2021, Issue 1 | 104637.1-104637.11 | 11 pages
  • Author affiliations

    Ctr Dev Adv Comp C DAC C DAC Innovat Pk Pune 411008 Maharashtra India;

    Ctr Dev Adv Comp C DAC C DAC Innovat Pk Pune 411008 Maharashtra India;

    Ctr Dev Adv Comp C DAC C DAC Innovat Pk Pune 411008 Maharashtra India;

    Ctr Dev Adv Comp C DAC C DAC Innovat Pk Pune 411008 Maharashtra India;

    Ctr Dev Adv Comp C DAC C DAC Innovat Pk Pune 411008 Maharashtra India;

    Ctr Dev Adv Comp C DAC C DAC Innovat Pk Pune 411008 Maharashtra India;

  • Original format: PDF
  • Text language: English