A computer architecture that significantly reduces latency in fetching instructions from main memory includes a code pump located proximate to the memory and a filter cache located proximate to the processor. The code pump reduces instruction-fetch latency by predicting the instruction streams that the processor may execute and passing instructions from all of those streams to the filter cache. The code pump fetches instructions from the memory and partially decodes them to determine their types. Instruction types that may change the flow of the program, such as subroutine calls and conditional branches, cause the code pump to concurrently supply instructions from all flow paths that can be predicted from these instructions. To keep track of the possible flow paths, the code pump maintains a data structure that combines multiple stack entries (for call instructions) and tree entries (for branch instructions). The filter cache passes the addresses of fetched instructions back to the code pump. The code pump uses these addresses to determine which flow paths were followed and to deallocate any entries in the data structure that correspond to paths that were not followed.
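The bookkeeping described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design in the abstract: the names `PathNode`, `FlowTracker`, `on_branch`, `on_call`, `on_return`, `on_fetched`, and `live_paths` are hypothetical, the addresses are made up, and a single shared return stack stands in for the "multiple stack entries" the abstract mentions. Branches add tree entries, calls push stack entries, and an address reported by the filter cache prunes the paths that were not followed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PathNode:
    """One predicted flow path, identified by its first instruction address."""
    start: int
    children: List["PathNode"] = field(default_factory=list)


class FlowTracker:
    """Toy model of the code pump's path bookkeeping (hypothetical layout)."""

    def __init__(self, entry: int) -> None:
        self.root = PathNode(entry)
        # Assumption: one shared stack of return addresses for call instructions.
        self.return_stack: List[int] = []

    def on_branch(self, node: PathNode, taken: int, fall_through: int) -> List[PathNode]:
        """Conditional branch: add a tree entry for each possible successor."""
        node.children = [PathNode(taken), PathNode(fall_through)]
        return node.children

    def on_call(self, node: PathNode, target: int, return_addr: int) -> PathNode:
        """Subroutine call: push the return address and continue at the target."""
        self.return_stack.append(return_addr)
        child = PathNode(target)
        node.children = [child]
        return child

    def on_return(self, node: PathNode) -> Optional[PathNode]:
        """Return: continue at the most recently pushed return address, if any."""
        if not self.return_stack:
            return None
        child = PathNode(self.return_stack.pop())
        node.children = [child]
        return child

    def on_fetched(self, address: int) -> bool:
        """Address reported by the filter cache: keep the matching path only.

        Sibling subtrees become unreachable, which models deallocation of
        entries for paths that were not followed.
        """
        for child in self.root.children:
            if child.start == address:
                self.root = child
                return True
        return False

    def live_paths(self) -> List[int]:
        """Start addresses of all leaf paths still being supplied."""
        leaves: List[int] = []
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node.children:
                stack.extend(reversed(node.children))
            else:
                leaves.append(node.start)
        return leaves


tracker = FlowTracker(0x100)
taken, fall = tracker.on_branch(tracker.root, 0x200, 0x104)
tracker.on_branch(taken, 0x300, 0x204)   # nested branch on the taken path
paths_before = tracker.live_paths()      # three candidate paths
tracker.on_fetched(0x200)                # processor followed the taken path
paths_after = tracker.live_paths()       # fall-through subtree is gone
```

In this model, pruning is just a matter of moving the root to the child whose address the filter cache reported. A hardware code pump would presumably free fixed-size entries explicitly, which the sketch does not attempt to show.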