This paper extends the consideration of fused floating-point arithmetic to operations that are frequently encountered in DSP. The fast Fourier transform is a case in point: it uses a complex butterfly operation. For a radix-2 implementation, the butterfly consists of a complex multiply followed by the complex addition and subtraction of the same pair of data. These butterfly operations can be implemented with two fused primitives: a fused two-term inner product unit and a fused add-subtract unit. A floating-point fused FFT butterfly unit is presented that performs a single-precision floating-point butterfly operation in only 87% of the time required by a conventional floating-point butterfly. When placed and routed in a 45 nm process, the fused FFT butterfly unit occupied about 72% of the area needed to implement a floating-point butterfly using conventional floating-point adders and multipliers. The numerical result of the fused butterfly unit is also more accurate than that of the conventional implementation, because fewer rounding operations are needed.
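As a rough illustration of the decomposition described above, the following Python sketch expresses a radix-2 butterfly in terms of the two primitives. The function names (`fused_dot2`, `fused_add_sub`, `butterfly`) and the twiddle-factor symbol `w` are assumptions made for this example and are not taken from the paper. The sketch models only the dataflow: Python floats are double precision and each arithmetic operation rounds separately, so it does not reproduce the single-rounding behaviour or the timing of a fused hardware unit.

```python
def fused_dot2(a, b, c, d):
    """Two-term inner product a*b + c*d.

    Hypothetical software stand-in for the fused two-term inner product
    primitive; a hardware unit would round once, this version does not.
    """
    return a * b + c * d


def fused_add_sub(a, b):
    """Return the pair (a + b, a - b) computed from the same two operands.

    Hypothetical stand-in for the fused add-subtract primitive.
    """
    return a + b, a - b


def butterfly(a, b, w):
    """Radix-2 butterfly: returns (a + w*b, a - w*b) for complex a, b, w."""
    # Complex multiply p = w * b, expressed as two two-term inner products.
    p_re = fused_dot2(w.real, b.real, -w.imag, b.imag)
    p_im = fused_dot2(w.real, b.imag, w.imag, b.real)
    # Complex sum and difference of the same pair (a, p),
    # one add-subtract each for the real and imaginary parts.
    x_re, y_re = fused_add_sub(a.real, p_re)
    x_im, y_im = fused_add_sub(a.imag, p_im)
    return complex(x_re, x_im), complex(y_re, y_im)


if __name__ == "__main__":
    # Example inputs chosen so that every intermediate value is exact.
    x, y = butterfly(complex(1.0, 2.0), complex(3.0, 4.0), complex(0.0, -1.0))
    print(x, y)
```

In this arrangement one butterfly needs two inner-product operations and two add-subtract operations, where a discrete implementation would use four multiplies and six additions or subtractions, each with its own rounding step.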