Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

A VLSI Array Processing Oriented Fast Fourier Transform Algorithm and Hardware Implementation



Abstract

Many parallel Fast Fourier Transform (FFT) algorithms adopt a multi-stage architecture to increase performance. However, data permutation between stages consumes substantial memory and processing time. This paper proposes an FFT array-processing mapping algorithm to overcome this drawback. In this algorithm, any 2^k butterfly units (BUs) can be scheduled to work in parallel on n = 2^s data points (k = 0, 1, ..., s - 1). Because no inter-stage data transfer is required, both memory consumption and system latency are greatly reduced. Moreover, as the number of BUs increases, throughput increases linearly and system latency decreases proportionally. This array-processing-oriented architecture provides a flexible trade-off between hardware cost and system performance. In theory, the system latency is s × 2^(s-k) × t_clk and the throughput is n / (s × 2^(s-k) × t_clk), where t_clk is the system clock period. Based on this mapping algorithm, several 18-bit word-length, 1024-point FFT processors implemented in TSMC 0.18 μm CMOS technology are presented to demonstrate its scalability and high performance. The core area of the 4-BU design is 2.991 × 1.121 mm^2, and its clock frequency is 326 MHz under typical conditions (1.8 V, 25 °C). This processor completes a 1024-point FFT in 7.839 μs.
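The latency and throughput expressions quoted in the abstract can be evaluated directly. The following is a minimal sketch, not code from the paper: the function name `fft_array_timing`, its parameters, and the returned field names are hypothetical, and the only formulas used are the two stated above (latency = s × 2^(s-k) × t_clk, throughput = n / latency).

```python
def fft_array_timing(s: int, k: int, f_clk_hz: float) -> dict:
    """Evaluate the abstract's theoretical timing model.

    s        : log2 of the FFT size (n = 2**s points)
    k        : log2 of the number of butterfly units (2**k BUs)
    f_clk_hz : system clock frequency in Hz (t_clk = 1 / f_clk_hz)

    The function and field names are illustrative assumptions.
    """
    if s < 1 or not 0 <= k <= s - 1:
        raise ValueError("require s >= 1 and 0 <= k <= s - 1")
    if f_clk_hz <= 0:
        raise ValueError("clock frequency must be positive")

    n = 2 ** s                       # number of data points
    cycles = s * 2 ** (s - k)        # latency in clock cycles
    latency_s = cycles / f_clk_hz    # latency = s * 2^(s-k) * t_clk
    throughput = n / latency_s       # points per second

    return {
        "points": n,
        "butterfly_units": 2 ** k,
        "latency_cycles": cycles,
        "latency_s": latency_s,
        "throughput_points_per_s": throughput,
    }


if __name__ == "__main__":
    # 1024-point FFT (s = 10) at the 326 MHz clock quoted for the 4-BU design.
    for k in range(4):
        r = fft_array_timing(s=10, k=k, f_clk_hz=326e6)
        print(
            f"{r['butterfly_units']:>2} BU: "
            f"{r['latency_cycles']:>5} cycles, "
            f"{r['latency_s'] * 1e6:.3f} us, "
            f"{r['throughput_points_per_s'] / 1e6:.1f} Mpoints/s"
        )
```

For s = 10 and k = 2 (four BUs) the model gives 2560 clock cycles, which at a nominal 326 MHz is roughly 7.85 μs, close to the 7.839 μs reported for the 4-BU design. Each doubling of the number of BUs halves the cycle count and doubles the throughput in this model.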
