A systematic folding transformation technique is presented for folding an arbitrary signal processing algorithm data-flow graph onto a hardware data-flow architecture, for a specified folding set and specified technology constraints. The folding set specifies, for each task, the processor on which and the time partition at which the task is executed; it is typically obtained by performing scheduling and resource allocation for the algorithm data-flow graph and the specified iteration period. The constraints imposed on the hardware architecture are also assumed to be known. The technique is used to derive the control circuitry of the hardware architecture. The authors derive conditions for the validity of a specified folding set and present approaches to generate the dedicated architecture by systematically folding tasks to operators. They also propose automatic retiming and pipelining of algorithms described by data-flow graphs in preparation for folding: the folding algorithm is applied after the data-flow graph has been preprocessed by automated pipelining and retiming.
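
The abstract does not state the validity conditions explicitly. As a rough illustration only, the sketch below uses the folding equation as it is commonly written in the folding literature, D_F(U -> V) = N * w(e) - P_U + v - u, and treats a folding set as valid when every folded edge delay is non-negative. The names `Edge`, `folded_delays`, and `is_valid_folding`, as well as the two-node example graph, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str     # producing task U
    dst: str     # consuming task V
    delays: int  # w(e): number of delays on the edge in the algorithm graph


def folded_delays(edges, time_partition, pipeline_stages, n):
    """Folded delay for every edge (assumed form of the folding equation).

    time_partition[t]  -- time partition (0 .. n-1) at which task t executes
    pipeline_stages[t] -- pipeline depth of the operator executing task t
    n                  -- folding factor (tasks time-multiplexed per operator)
    """
    result = {}
    for e in edges:
        u = time_partition[e.src]
        v = time_partition[e.dst]
        p_u = pipeline_stages[e.src]
        result[(e.src, e.dst)] = n * e.delays - p_u + v - u
    return result


def is_valid_folding(edges, time_partition, pipeline_stages, n):
    """A folding set is treated as valid when no folded delay is negative."""
    delays = folded_delays(edges, time_partition, pipeline_stages, n)
    return all(d >= 0 for d in delays.values())


# Hypothetical two-task loop: an adder A (1 stage) and a multiplier M (2 stages).
time_partition = {"A": 0, "M": 1}
pipeline_stages = {"A": 1, "M": 2}
tight_loop = [Edge("A", "M", 0), Edge("M", "A", 1)]
retimed_loop = [Edge("A", "M", 0), Edge("M", "A", 2)]

tight = folded_delays(tight_loop, time_partition, pipeline_stages, n=2)
retimed = folded_delays(retimed_loop, time_partition, pipeline_stages, n=2)
```

In this toy case the edge from M back to A has a negative folded delay when the loop carries a single delay, so that folding set would be rejected; with two delays on the edge every folded delay is non-negative. This is the kind of situation in which retiming or pipelining the graph before folding becomes necessary.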