Techniques for generating highly optimized code for a pipelined microprocessor, the NS32532, and its fast floating-point slave processor, the NS32580, are described in the context of the CTP family of optimizing compilers. All CTP compilers are constructed from three separate parts: a language-dependent compiler front end, a shared global optimizer, and a shared code generator. In addition to most classical transformations, such as value propagation, redundant and dead code elimination, loop-invariant code motion, global strength reduction, and register allocation, the CTP compilers also perform less common optimizations, such as loop unrolling, basic block reorganization, code reordering, and profile feedback utilization. The relative influence of the different optimizations on the performance of the NS32532, measured with several standard benchmark programs, is presented.