cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs

Abstract

The Fast Fourier Transform (FFT) is one of the most important numerical tools, widely used in many scientific and engineering applications. The algorithm performs O(n log n) operations on n input data points, yet in many cases only a small number k of the output coefficients are large, while the remaining n - k are zero or negligibly small. The FFT is clearly inefficient when n input points lead to only k ≪ n non-zero coefficients in the transformed domain. In 2012, researchers at MIT developed a sparse FFT (sFFT) algorithm that addresses this problem. In this paper, we explore the challenges of porting sFFT efficiently to massively parallel processors such as GPUs using CUDA, and propose effective solutions. GPGPUs are increasingly adopted as HPC platforms because of their tremendous computing power and remarkable cost efficiency. However, sFFT is a complex, computationally challenging, memory-bound algorithm that is not straightforward to implement on GPUs. We present some of the optimization strategies required to compute sFFT on such massively parallel architectures, including index coalescing, loop splitting, asynchronous data layout transformation, and a linear-time selection algorithm. Our CUDA-based sFFT, cusFFT, runs over 10x faster than the state-of-the-art cuFFT library on GPUs and over 28x faster than parallel FFTW on multicore CPUs.
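
The sparsity premise in the abstract can be illustrated with a small, CPU-only sketch. This is not the sFFT or cusFFT algorithm: it runs an ordinary dense radix-2 FFT on a signal built from k = 3 tones, then picks out the k largest coefficients with a quickselect, which is one common expected-linear-time selection method and not necessarily the one used in the paper. All function names and the test signal are assumptions made for illustration.

```python
import cmath
import random


def make_sparse_signal(n, freqs, amps):
    """Time-domain signal whose spectrum has energy only at the bins in `freqs`."""
    return [
        sum(a * cmath.exp(2j * cmath.pi * f * t / n) for f, a in zip(freqs, amps))
        for t in range(n)
    ]


def fft(x):
    """Dense recursive radix-2 FFT, O(n log n); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for j in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * j / n) * odd[j]
        out[j] = even[j] + t
        out[j + n // 2] = even[j] - t
    return out


def kth_largest(values, k, rng):
    """k-th largest value (1-based) via quickselect, expected linear time."""
    items = list(values)
    while True:
        pivot = rng.choice(items)
        larger = [v for v in items if v > pivot]
        equal = [v for v in items if v == pivot]
        if k <= len(larger):
            items = larger
        elif k <= len(larger) + len(equal):
            return pivot
        else:
            k -= len(larger) + len(equal)
            items = [v for v in items if v < pivot]


def largest_coefficients(spectrum, k, seed=0):
    """Indices of the k largest-magnitude bins, found without a full sort."""
    mags = [abs(c) for c in spectrum]
    threshold = kth_largest(mags, k, random.Random(seed))
    return [f for f, m in enumerate(mags) if m >= threshold][:k]


# Hypothetical example: n = 256 samples, k = 3 non-zero frequency bins.
n, freqs, amps = 256, [5, 40, 100], [1.0, 0.5, 2.0]
spectrum = fft(make_sparse_signal(n, freqs, amps))
print(largest_coefficients(spectrum, k=len(freqs)))
```

The dense transform here computes all 256 bins even though only three carry energy, which is the inefficiency the abstract describes. A sparse FFT aims to recover those few bins without computing the rest.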

Bibliographic details

Source
IEEE International Parallel and Distributed Processing Symposium | 2016 | 575p | 10 pages
Authors
Cheng Wang; Sunita Chandrasekaran; Barbara Chapman;
Original format: PDF
Chinese Library Classification: TP311.13-53
Keywords
Graphics processing units; Instruction sets; Libraries; Frequency-domain analysis; Optimization; Computer architecture; Kernel;

Date indexed: 2022-08-21 04:32:34

Similar literature

1. On Performance of Sparse Fast Fourier Transform and Enhancement Algorithm [J]. Gui-Lin Chen, Shang-Ho Tsai, Kai-Jiun Yang. IEEE Transactions on Signal Processing. 2017, no. 21
2. Application Research on Sparse Fast Fourier Transform Algorithm in White Gaussian Noise [J]. Liu Zhong, Li Lichun, Li Huiqi. Procedia Computer Science. 2017, no. 1
3. Fast Computation of the Multidimensional Discrete Fourier Transform and Discrete Backward Fourier Transform on Sparse Grids [J]. Ying Jiang, Yuesheng Xu. Mathematics of Computation. 2014, no. 289
4. cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs [C]. Cheng Wang, Sunita Chandrasekaran, Barbara Chapman. IEEE International Parallel and Distributed Processing Symposium. 2016
5. A framework for cooperative wideband spectrum sensing using the Robust Fast Fourier Aliasing based Sparse Transform (R-FFAST) [D]. Thibodeau, Brian M. 2016
6. GPU-accelerated non-uniform fast Fourier transform-based compressive sensing spectral domain optical coherence tomography [O]. Daguang Xu, Yong Huang, Jin U. Kang
7. Reviews of bearing vibration measurement using fast Fourier transform and enhanced fast Fourier transform algorithms [O]. Hsiung-Cheng Lin, Yu-Chen Ye. 2019
8. Comparison of Arithmetic Requirements for the PFA (Prime Factor Algorithm), WFTA (Winograd Fourier Transform Algorithm), SWIFT, MFFT (Mixed Radix Fast Fourier Transform), FFT (Fast Fourier Transform) and DFT (Discrete Fourier Transform) Algorithms [R]. Hicks, R. C. 1982
