Home > Foreign-language conferences > European Conference on Parallel Computing > SIMD Vectorization of Straight Line FFT Code

SIMD Vectorization of Straight Line FFT Code



Abstract

This paper presents compiler technology that targets general-purpose microprocessors augmented with SIMD execution units for exploiting data-level parallelism. FFT kernels are accelerated by automatically vectorizing blocks of straight line code for processors featuring two-way short-vector SIMD extensions such as AMD's 3DNow! and Intel's SSE2. Additionally, a special compiler backend is introduced that is able to (i) utilize particular code properties, (ii) generate optimized address computation, and (iii) apply specialized register allocation and instruction scheduling. Experiments show that automatic SIMD vectorization can achieve performance comparable to that of the optimal hand-generated code for FFT kernels. The newly developed methods have been integrated into the codelet generator of FFTW and have successfully vectorized complicated code such as real-to-halfcomplex non-power-of-two FFT kernels. The floating-point performance of FFTW's scalar version has been more than doubled, resulting in the fastest FFT implementation to date.
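To make the idea of two-way vectorization of straight line code concrete, here is a minimal sketch of a size-2 complex DFT, written first as four scalar operations and then as two paired operations. It is an illustrative model only and not the paper's vectorizer: a Python 2-tuple stands in for a two-way SIMD register, and the names `vadd`, `vsub`, `fft2_scalar`, and `fft2_two_way` are hypothetical.

```python
# Illustrative model only: a 2-tuple stands in for a two-way SIMD register.
# All function names here are hypothetical and do not come from FFTW.

def vadd(a, b):
    """Two-way vector add: one operation covers both lanes."""
    return (a[0] + b[0], a[1] + b[1])


def vsub(a, b):
    """Two-way vector subtract: one operation covers both lanes."""
    return (a[0] - b[0], a[1] - b[1])


def fft2_scalar(xr0, xi0, xr1, xi1):
    """Straight line scalar size-2 DFT: four independent scalar operations."""
    yr0 = xr0 + xr1
    yi0 = xi0 + xi1
    yr1 = xr0 - xr1
    yi1 = xi0 - xi1
    return yr0, yi0, yr1, yi1


def fft2_two_way(x0, x1):
    """Same computation after pairing the real and imaginary operations:
    two vector operations on (re, im) pairs instead of four scalar ones."""
    return vadd(x0, x1), vsub(x0, x1)
```

In this toy case the pairing is obvious because each complex number already forms a (re, im) pair. Larger kernels also contain multiplications and operations that mix real and imaginary parts, so the pairing is less direct there.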
