首页> 外文期刊>Parallel Computing >Parallel implementation and scalability analysis of 3D Fast Fourier Transform using 2D domain decomposition
【24h】

Parallel implementation and scalability analysis of 3D Fast Fourier Transform using 2D domain decomposition

机译:使用2D域分解的3D快速傅立叶变换的并行实现和可伸缩性分析

获取原文
获取原文并翻译 | 示例

摘要

3D FFT is computationally intensive and at the same time requires global or collective communication patterns. The efficient implementation of FFT on extreme scale computers is one of the grand challenges in scientific computing. On parallel computers with a distributed memory, different domain decompositions are possible to scale 3D FFT computation. In this paper, we argue that 2D domain decomposition is likely the best approach in terms of using a very large number of processors with reasonable data communication overhead. Specifically, we extend the data communication approach of Dmitruk et al. (2001) [21 ] previously used for ID domain decomposition, to 2D domain decomposition. A thorough quantitative analysis of the code performance is undertaken for different problem sizes and numbers of processors, including scalability, load balance, dependence on subdomain configuration (i.e., different numbers of subdomain in the two decomposed directions for a fixed total number of subdomains). We show that our proposed approach is faster than the existing attempts on 2D-decomposition of 3D FFTs by Pekurovsky (2007) [23] (p3dfft), Takahashi (2009) [24], and Li and Laizet (2010) [25] (2decomp.org) especially for the case of large problem size and large number of processors (our strategy is 28% faster than Peku-rovski's scheme, its closest competitor). We also show theoretically that our scheme performs better than the approach by Nelson et al. (1993) [22] up to a certain number of processors beyond which latency becomes and issue. We demonstrate that the speedup scales with the number of processors almost linearly before it saturates. The execution time on different processors differ by less than 5%, showing an excellent load balance. We further partitioned the execution time into computation, communication, and data copying related to the transpose operation, to understand how the relative percentage of the communication time increases with the number of processors. Finally, a theoretical complexity analysis is carried out to predict the scalability and its saturation. The complexity analysis indicates that the 2D domain decomposition will make it feasible to run a large 3D FFT on scalable computers with several hundred thousands processors.
机译:3D FFT计算量大,同时需要全局或集体通信模式。 FFT在极限计算机上的有效实现是科学计算中的重大挑战之一。在具有分布式内存的并行计算机上,可能会进行不同的域分解来扩展3D FFT计算。在本文中,我们认为,就使用大量具有合理数据通信开销的处理器而言,二维域分解可能是最好的方法。具体来说,我们扩展了Dmitruk等人的数据通信方法。 (2001)[21]以前用于ID域分解,到2D域分解。针对不同问题大小和处理器数量进行了代码性能的全面定量分析,包括可伸缩性,负载平衡,对子域配置的依赖(即,在固定子域总数的两个分解方向上子域数量不同)。我们证明,我们提出的方法比Pekurovsky(2007)[23](p3dfft),Takahashi(2009)[24]以及Li and Laizet(2010)[25](3D FFT的2D分解的现有尝试)更快。 2decomp.org),尤其是在问题规模大且处理器数量众多的情况下(我们的策略比最接近的竞争对手Peku-rovski的方案快28%)。从理论上我们还表明,我们的方案比Nelson等人的方法具有更好的性能。 (1993)[22]多达一定数量的处理器,超过这些处理器就会出现延迟并引起延迟。我们证明,在处理器饱和之前,加速率几乎与处理器数量成线性比例关系。不同处理器上的执行时间相差不到5%,显示出出色的负载平衡。我们进一步将执行时间划分为与转置操作相关的计算,通信和数据复制,以了解通信时间的相对百分比如何随处理器数量的增加而增加。最后,进行了理论上的复杂度分析,以预测可伸缩性及其饱和度。复杂性分析表明,二维域分解将使在具有数十万个处理器的可伸缩计算机上运行大型3D FFT成为可能。

著录项

  • 来源
    《Parallel Computing》 |2013年第1期|58-77|共20页
  • 作者

    Orlando Ayala; Lian-Ping Wang;

  • 作者单位

    Department of Mechanical Engineering, 126 Spencer laboratory, University of Delaware. Newark, DE 19716-3140, USA,Centro de Metodos Numericos en lngenieria, Escuela de Ingenieria y Ciencias Aplicadas, Universidad de Oriente, Puerto La Cruz, Venezuela;

    Department of Mechanical Engineering, 126 Spencer laboratory, University of Delaware. Newark, DE 19716-3140, USA;

  • 收录信息 美国《科学引文索引》(SCI);美国《工程索引》(EI);
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

    FFT; 2D decomposition; parallel computing;

    机译:FFT;二维分解并行计算;

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号