The fast Fourier transform (FFT) is essential to many image processing and scientific computing techniques. This paper presents a CUDA-based implementation that accelerates FFT computation. Based on an analysis of the GPU's execution model and the parallelism inherent in the FFT algorithm, it proposes a multithreaded mapping strategy and explores optimizations across the memory hierarchy. Experimental results show that the optimized implementation achieves an average speedup of 2x to 6x over CUFFT, the FFT library supplied by NVIDIA.