SIMD Vectorization of Straight Line FFT Code

Abstract

This paper presents compiler technology that targets general-purpose microprocessors augmented with SIMD execution units for exploiting data-level parallelism. FFT kernels are accelerated by automatically vectorizing blocks of straight-line code for processors featuring two-way short vector SIMD extensions like AMD's 3DNow! and Intel's SSE2. Additionally, a special compiler backend is introduced which is able to (i) utilize particular code properties, (ii) generate optimized address computation, and (iii) apply specialized register allocation and instruction scheduling. Experiments show that automatic SIMD vectorization can achieve performance that is comparable to the optimal hand-generated code for FFT kernels. The newly developed methods have been integrated into the codelet generator of FFTW and successfully vectorized complicated code like real-to-halfcomplex non-power-of-two FFT kernels. The floating-point performance of FFTW's scalar version has been more than doubled, resulting in the fastest FFT implementation to date.
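As a rough illustration of what two-way vectorization of straight-line code means, the sketch below rewrites one radix-2 butterfly so that the real and imaginary parts share a two-lane vector. It is not the paper's algorithm, which this page does not describe. The helper names (`vadd`, `vsub`, `vmul`, `vswap`) are hypothetical stand-ins for two-way SIMD instructions, modeled here with plain tuples so the example runs anywhere.

```python
# Illustrative only: a 2-lane "vector" is modeled as a tuple (lane0, lane1).
# The helper names are hypothetical; they stand in for two-way SIMD operations.

def vadd(a, b):
    """Lane-wise addition."""
    return (a[0] + b[0], a[1] + b[1])

def vsub(a, b):
    """Lane-wise subtraction."""
    return (a[0] - b[0], a[1] - b[1])

def vmul(a, b):
    """Lane-wise multiplication."""
    return (a[0] * b[0], a[1] * b[1])

def vswap(a):
    """Exchange the two lanes."""
    return (a[1], a[0])

def butterfly_scalar(x0r, x0i, x1r, x1i, wr, wi):
    """Straight-line scalar code: y0 = x0 + w*x1, y1 = x0 - w*x1."""
    tr = wr * x1r - wi * x1i
    ti = wr * x1i + wi * x1r
    return (x0r + tr, x0i + ti, x0r - tr, x0i - ti)

def butterfly_two_way(x0, x1, wr, wi):
    """Same butterfly with each (re, im) pair packed into one 2-lane vector."""
    # lane0: wr*x1r - wi*x1i, lane1: wr*x1i + wi*x1r
    t = vadd(vmul((wr, wr), x1), vmul((-wi, wi), vswap(x1)))
    return vadd(x0, t), vsub(x0, t)
```

In this sketch the ten scalar arithmetic operations become five two-lane operations plus one lane swap. Calling `butterfly_two_way((1.0, 2.0), (3.0, 4.0), 0.5, -0.25)` gives the same values as the scalar version, grouped as two (re, im) pairs.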

Bibliographic record

Source: European Conference on Parallel Computing | 2003 | 10 pages | Lecture Notes in Computer Science 2790
Authors: Stefan Kral; Franz Franchetti; Juergen Lorenz; Christoph W. Ueberhuber
Original format: PDF
Chinese Library Classification: TP3-53

Similar literature

1. Efficient Utilization of Vector Registers to Improve FFT Performance on SIMD Microprocessors [J]. Feng YU, Ruifeng GE, Zeke WANG. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences. 2013, No. 7
2. Efficient Power Gating of SIMD Accelerators Through Dynamic Selective Devectorization in an HW/SW Codesigned Environment [J]. Kumar Rakesh, Martinez Alejandro, Gonzalez Antonio. ACM Transactions on Architecture and Code Optimization. 2014, No. 3
3. A video DSP with a macroblock-level-pipeline and a SIMD type vector-pipeline architecture for MPEG2 CODEC [J]. Toyokura M., Kodama H. IEEE Journal of Solid-State Circuits. 1994, No. 12
4. SIMD Vectorization of Straight Line FFT Code [C]. Stefan Kral, Franz Franchetti, Juergen Lorenz. European Conference on Parallel Computing. 2003
5. New Algorithms for High-Throughput Decoding with Low-Density Parity-Check Codes using Fixed-Point SIMD Processors [D]. Kennedy, JaWone Anthony. 2012
6. Multidirectional Scanning Model MUSCLE to Vectorize Raster Images with Straight Lines [O]. Ismail Rakip Karas, Bulent Bayram, Fatmagul Batuk. 2008
7. SIMD vectorization of straight line FFT code [O]. Stefan Kral, Franz Franchetti, Juergen Lorenz. 2003
8. Short Vector SIMD Code Generation for DSP Algorithms [R]. Franchetti, F., Ueberhuber, C., Pueschel, M. 2002
