Multi-GPU and multi-CPU accelerated FDTD scheme for vibroacoustic applications

Frances J.; Otero B.; Bleda S.; Gallego S.; Neipp C.; Marquez A.; Belendez A.

Abstract

The Finite-Difference Time-Domain (FDTD) method is applied to the analysis of vibroacoustic problems and to the study of the propagation of longitudinal and transversal waves in stratified media. This work demonstrates the potential of the scheme and the relevance of each acceleration strategy for massive FDTD computations. We propose two new implementations of the two-dimensional FDTD scheme, using multi-CPU and multi-GPU respectively. In the first implementation, an open source message passing interface (OMPI) is used to fully exploit the resources of a dual-processor workstation with two Intel Xeon processors. In addition, the CPU code version includes the Streaming SIMD Extensions (SSE) and the Advanced Vector Extensions (AVX), combined with shared memory approaches that take advantage of multi-core platforms. The second implementation, the multi-GPU code version, is based on the Peer-to-Peer communications available in CUDA and runs on two GPUs (NVIDIA GTX 670). The paper then analyses the influence of the different code versions, including shared memory approaches, vector instructions and multi-processors (both CPU and GPU), and compares them in order to quantify the improvement obtained from distributed solutions based on multi-CPU and multi-GPU. The performance of both approaches was analysed, and it was shown that adding shared memory schemes to CPU computing substantially improves the performance of vector instructions, enlarging the range of simulation sizes that use the CPU cache memory efficiently. In this case, GPU computing is about twice as fast as the fine-tuned CPU version, for both one and two nodes. However, for massive computations explicit vector instructions are not worthwhile, since memory bandwidth is the limiting factor and performance tends to be the same as that of the sequential version with auto-vectorisation and also the shared memory approach. In this scenario GPU computing is the best option, since it provides a homogeneous behaviour. More specifically, the speedup of GPU computing reaches an upper limit of 12 for both one and two GPUs, whereas performance peaks at 80 GFlops for one GPU and 146 GFlops for two GPUs. Finally, the method is applied to an earth crust profile in order to demonstrate the potential of the approach and the need for acceleration strategies in this type of application. (C) 2015 Elsevier B.V. All rights reserved.
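
As an illustration of the kind of stencil update that such acceleration strategies target, the following is a minimal sketch of a two-dimensional FDTD time loop. It is not the authors' scheme: it assumes a simpler scalar acoustic pressure-velocity formulation on a staggered grid with rigid walls, rather than the vibroacoustic formulation with longitudinal and transversal waves described in the abstract. The function name `run_fdtd` and all parameter values are hypothetical and chosen only for illustration.

```python
def run_fdtd(nx=41, ny=41, steps=30, c=343.0, rho=1.2, dx=0.01, courant=0.5):
    """Advance a 2D acoustic FDTD grid and return the pressure field.

    Assumed (illustrative) model: scalar pressure p at cell centres and
    particle velocities vx, vy on cell faces, with rigid outer walls.
    """
    dt = courant * dx / c            # 2D stability requires courant <= 1/sqrt(2)
    cv = dt / (rho * dx)             # velocity update coefficient
    cp = rho * c * c * dt / dx       # pressure update coefficient

    p = [[0.0] * ny for _ in range(nx)]
    vx = [[0.0] * ny for _ in range(nx + 1)]   # x-faces; vx[0] and vx[nx] are walls
    vy = [[0.0] * (ny + 1) for _ in range(nx)]  # y-faces; vy[.][0] and vy[.][ny] are walls

    p[nx // 2][ny // 2] = 1.0        # initial pressure pulse in the centre cell

    for _ in range(steps):
        # Velocity update from the pressure gradient (interior faces only,
        # so the wall faces keep zero normal velocity).
        for i in range(1, nx):
            for j in range(ny):
                vx[i][j] -= cv * (p[i][j] - p[i - 1][j])
        for i in range(nx):
            for j in range(1, ny):
                vy[i][j] -= cv * (p[i][j] - p[i][j - 1])
        # Pressure update from the velocity divergence.
        for i in range(nx):
            for j in range(ny):
                div = vx[i + 1][j] - vx[i][j] + vy[i][j + 1] - vy[i][j]
                p[i][j] -= cp * div
    return p


if __name__ == "__main__":
    field = run_fdtd()
    print("peak |p| after 30 steps:", max(abs(v) for row in field for v in row))
```

Each cell update reads only its nearest neighbours and is independent of the other cells within a sweep, which is why loops of this shape lend themselves to vector instructions, shared memory threading, domain decomposition across processes, or one-thread-per-cell GPU kernels. Each update also does very little arithmetic per value loaded, which is consistent with the abstract's remark that memory bandwidth becomes the limiting factor for large grids.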

Bibliographic record

Source
Computer Physics Communications | 2015 | 9 pages
Authors
Frances J.; Otero B.; Bleda S.; Gallego S.; Neipp C.; Marquez A.; Belendez A.;
Author affiliations

Univ Alicante Dpto Fis Ingn Sistemas & Teoria Senal E-03080 Alicante Spain;
Univ Politecn Cataluna Dept Arquitectura Comp Barcelona Spain;
Univ Alicante Dpto Fis Ingn Sistemas & Teoria Senal E-03080 Alicante Spain;
Univ Alicante Dpto Fis Ingn Sistemas & Teoria Senal E-03080 Alicante Spain;
Univ Alicante Dpto Fis Ingn Sistemas & Teoria Senal E-03080 Alicante Spain;
Univ Alicante Dpto Fis Ingn Sistemas & Teoria Senal E-03080 Alicante Spain;
Univ Alicante Dpto Fis Ingn Sistemas & Teoria Senal E-03080 Alicante Spain;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computer applications;
Keywords
FDTD; GPU computing; OMPI; SIMD extensions;
