We propose an alternative approach to branch resolution based on the earlier work on decoupled memory architectures. Branch decoupling is a technique to decouple a single instruction stream program into two streams. One stream is solely dedicatedto resolving branches as early as possible (both the branch condition and the branch target). The resolved branch targets are consumed by the other computing stream through a queue. We have proposed a compiler based, static branch decoupling methodologyearlier In this paper we propose a dynamic branch decoupled (DBD) architecture. Simulations show a speedup of 25.6% for SPEC95 integer benchmarks and 6.1% for SPEC95 FP benchmarks over a 2-level adaptive branch predictor. The average number of branchpenalty cycles per instruction for DBD reduces to .0475 compared to .0835 for the 2-level branch predictor.
展开▼