In this paper we present a hardware architecture for implementing discrete trigonometric transforms (DTTs), including the popular fast discrete cosine transform (DCT) and discrete sine transform (DST). The design method is modular and uses predesigned components to construct a transform system. We present a data shuffle network structure and show how it is used in conjunction with a partial column structure to build and compute the transforms. Scalability depends only on the transform size and the number of processing elements (PEs). The transform throughput is determined by the number of PEs and the size of the associated shuffle network. We use a scalable DCT-II algorithm with a constant geometry structure to present the design methodology. The design approach can be applied to other discrete trigonometric transforms with a similar property.