VLIW DSP通过软件流水获得时间并行性,通过指令分簇获得空间并行性.指令的分簇本质上是资源分配问题.传统的指令分簇假设一条指令分到某一簇执行,而某些体系结构提供SIMD指令,传统的分簇算法对这类体系结构并不完全适用.提出的基于评估模型的分簇算法能对SIMD指令和普通指令进行合理的分簇.分簇之后,通过调度簇间传输指令,合成适当的簇间双字传输指令.由于SIMD和簇间双字传输的引入,以及较好的分簇决策,程序整体的调度延迟变短.对许多数字信号处理程序相对于没分簇的情况下的性能有2~3倍的性能提升,相对寄存器压力分簇算法有约7~10%性能的提升.%VLIW DSP obtain time parallelism through software pipelining, and obtain space parallelism through instruction clustering. The essence of clustering is resource allocation. Traditional clustering assumes that one instruction assigns to certain cluster, but that does not applicable to some architecture offering SIMD instructions. This article proposes an algorithm based on evaluation model can do well with the problem of clustering for ordinary instructions and SIMD instructions. By scheduling inter-cluster transfer instruction, we synthesize inter-cluster double word transfer instruction. With the help of SIMD instruction, inter-cluster double word transfer instruction and good clustering policy decision, we make the schedule latency shorter. For many DSP programs, comparing with no clustering, we obtain 2-3 times increase in performance, comparing with clustering algorithm based on register allocation, we obtain 7-10% increase in performance.
展开▼