Journal: Computer Physics Communications

Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing


Abstract

The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code. (C) 2015 The Authors. Published by Elsevier B.V.
