The inverse discrete cosine transform (IDCT) is a significant component of today's JPEG and MPEG decoders. Of all the stages in decoding a JPEG file, the IDCT is the most computationally intensive. Hence, fast and efficient implementations are required, in either software or hardware. Numerous individual designs for computing the 1D-IDCT have been proposed. In this paper, we describe a fast hardware implementation of a two-dimensional IDCT architecture that implements a variation of the modified Loeffler algorithm. Our 2D-IDCT incorporates two of our 1D-IDCT cores and a transpose network to provide a stall-free pipeline. The design has been functionally verified, synthesized, and tested on a Xilinx Virtex-II FPGA. Our FPGA implementation, an eight-wide pipeline clocked at 102 MHz, has a throughput of over 800 M coefficients per second. We suggest ideas to parallelize the design and further enhance performance. We also describe an ASIC design of the HDL model that operates at a clock frequency of 154 MHz using TSMC's 0.18 µm CMOS technology. Our VHDL implementation is released as open source.
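The row-column structure mentioned above (two 1D-IDCT passes joined by a transpose) can be illustrated with a minimal software sketch. This is not the paper's VHDL design: it assumes the standard orthonormal 8-point IDCT definition and evaluates it directly in O(N^2) per vector instead of using the modified Loeffler factorization. The names `idct_1d`, `transpose`, and `idct_2d` are hypothetical, and the final transpose is included only to restore the usual row/column orientation of the output.

```python
import math

N = 8  # block size used by JPEG and MPEG


def idct_1d(coeffs):
    """Direct 1D inverse DCT with orthonormal scaling.

    Assumption: this direct formula stands in for the paper's
    Loeffler-based 1D core; it is simpler but slower.
    """
    n = len(coeffs)
    out = []
    for x in range(n):
        total = 0.0
        for u in range(n):
            scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
            total += scale * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
        out.append(total)
    return out


def transpose(block):
    """Swap rows and columns of a square block (the transpose stage)."""
    return [list(col) for col in zip(*block)]


def idct_2d(block):
    """2D IDCT built from two 1D passes separated by a transpose."""
    first_pass = [idct_1d(row) for row in block]        # first 1D core: rows
    turned = transpose(first_pass)                      # transpose network
    second_pass = [idct_1d(row) for row in turned]      # second 1D core: columns
    return transpose(second_pass)                       # restore orientation


if __name__ == "__main__":
    # A block holding only a DC coefficient decodes to a flat block.
    block = [[0.0] * N for _ in range(N)]
    block[0][0] = 8.0
    for row in idct_2d(block):
        print(" ".join(f"{v:.3f}" for v in row))
```

On the throughput figure: if the pipeline accepts one vector of eight coefficients on every clock cycle, which is an assumption about how the eight-wide pipeline is fed, the peak rate at 102 MHz is 8 × 102 = 816 M coefficients per second, consistent with the stated figure of over 800 M.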