The fast multipole method on parallel clusters, multicore processors, and graphics processing units

Darve E.; Cecka C.; Takahashi T.


Abstract

In this article, we discuss how the fast multipole method (FMM) can be implemented on modern parallel computers, ranging from computer clusters to multicore processors and graphics cards (GPUs). The FMM is a somewhat difficult application for parallel computing because of its tree structure and because it requires many complex operations that are not regularly structured. Computational linear algebra with dense matrices, for example, allows many optimizations that leverage its regular computation pattern. The FMM can be similarly optimized, but we will see that the optimization steps are more complex. The discussion starts with a general presentation of FMMs. We then briefly discuss parallel methods for the FMM, such as building the FMM tree in parallel and reducing communication during the FMM procedure. Finally, we focus on porting and optimizing the FMM on GPUs.
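
The abstract mentions building the FMM tree in parallel but does not say how the authors do it. As a minimal sketch only, the block below shows one widely used approach that is assumed here and not taken from the article: assign each particle a Morton (Z-order) key at a fixed tree level and sort by key, so that the particles of each leaf cell end up contiguous. The function names `morton_keys` and `build_leaves`, the unit-cube domain, and the uniform (non-adaptive) tree are all illustrative assumptions.

```python
import numpy as np


def morton_keys(points, level):
    """Return one Morton key per point for a uniform octree of depth `level`.

    Assumption: `points` is an (n, 3) array with coordinates in [0, 1).
    """
    n_cells = 1 << level
    # Integer cell coordinates along each axis, clamped to the valid range.
    cells = np.minimum((points * n_cells).astype(np.int64), n_cells - 1)
    keys = np.zeros(len(points), dtype=np.int64)
    # Interleave the bits of the x, y, z cell coordinates.
    for bit in range(level):
        for axis in range(3):
            keys |= ((cells[:, axis] >> bit) & 1) << (3 * bit + axis)
    return keys


def build_leaves(points, level):
    """Sort particles by Morton key and return the leaf ranges.

    Returns the permutation `order` and a dict mapping each non-empty
    leaf key to the half-open range (start, end) into the sorted order.
    """
    keys = morton_keys(points, level)
    order = np.argsort(keys, kind="stable")
    sorted_keys = keys[order]
    unique_keys, starts = np.unique(sorted_keys, return_index=True)
    ends = np.append(starts[1:], len(points))
    leaves = {int(k): (int(s), int(e))
              for k, s, e in zip(unique_keys, starts, ends)}
    return order, leaves


points = np.array([
    [0.10, 0.10, 0.10],
    [0.90, 0.90, 0.90],
    [0.15, 0.05, 0.20],
    [0.60, 0.10, 0.10],
])
order, leaves = build_leaves(points, level=1)
```

With this bit layout, the parent of a cell one level up has key `key >> 3`. The key computation is independent for each particle, and the sort is the only step that needs global coordination, which is why key-based construction is often chosen for clusters and GPUs.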

Bibliographic record

Source: Comptes rendus. Mecanique | 2011, Issue 3 | 9 pages
Authors: Darve E.; Cecka C.; Takahashi T.
Original format: PDF
Language: English
Chinese Library Classification: Mechanical engineering (fundamental theory of mechanical design)
Keywords: Computer science; Fast multipole method; Parallel computer

