A program stream that includes multiple iterations of a first load instruction is executed at a first processing engine. An instruction loop is executed at a second processing engine, separate from the first processing engine, substantially in parallel with execution of the program stream at the first processing engine, to prefetch data from memory to a buffer for one or more iterations of the first load instruction. The instruction loop represents a subset of the sequence of instructions between iterations of the first load instruction, namely those instructions that affect an address value associated with the first load instruction. A confidence value associated with the instruction loop is modified based on the prefetch performance for one or more iterations of the first load instruction, and whether to terminate execution of the instruction loop is determined based on the confidence value.
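The confidence mechanism described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not say how the confidence value is updated or compared, so the saturating counter, the starting value, the ceiling, and the termination threshold below are all hypothetical, as are the names `PrefetchLoopMonitor` and `run`. Each boolean in `outcomes` stands for whether one iteration of the first load instruction found its data already prefetched into the buffer.

```python
from dataclasses import dataclass


@dataclass
class PrefetchLoopMonitor:
    """Hypothetical confidence tracker for the prefetching instruction loop."""
    confidence: int = 4       # assumed starting value
    max_confidence: int = 7   # assumed saturation ceiling
    terminate_below: int = 1  # assumed termination threshold

    def record(self, prefetch_hit: bool) -> None:
        # Assumed policy: raise confidence when the load found its data in
        # the buffer, lower it otherwise, saturating at both ends.
        if prefetch_hit:
            self.confidence = min(self.confidence + 1, self.max_confidence)
        else:
            self.confidence = max(self.confidence - 1, 0)

    def should_terminate(self) -> bool:
        # The loop is terminated once confidence drops below the threshold.
        return self.confidence < self.terminate_below


def run(outcomes, monitor=None):
    """Feed per-iteration prefetch outcomes to the monitor.

    Returns (iterations_observed, final_confidence); observation stops at
    the iteration where the monitor decides to terminate the loop.
    """
    if monitor is None:
        monitor = PrefetchLoopMonitor()
    observed = 0
    for hit in outcomes:
        observed += 1
        monitor.record(hit)
        if monitor.should_terminate():
            break
    return observed, monitor.confidence
```

With these assumed parameters, a run of consecutive misses drives the counter to zero and ends the loop after four iterations, while a run of hits keeps the loop alive with the counter saturated at its ceiling.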