Methods and architectures are provided for coordinating the operation of a plurality of processing units in a parallel computing architecture, wherein each processing unit is configured to process work elements of dynamically generated work groups using a resource (e.g. memory) associated with the work group. The method includes: requesting a resource associated with one of the work groups from a main storage for use by a first processing unit, which causes the resource to be stored in a temporary storage (e.g. a cache); transmitting a notification message to a scheduling unit associated with a second processing unit, indicating that the resource has been requested; in response to receiving the notification message at that scheduling unit, determining whether a pool of pending work associated with the second processing unit comprises a pending work group associated with the resource; and, if so, prioritizing processing of that work group by the second processing unit so that it obtains the resource from the temporary storage.
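The coordination steps described above can be illustrated with a minimal software sketch. It is not the claimed hardware implementation. The class and method names (`WorkGroup`, `Cache`, `Scheduler`, `on_resource_requested`, `run_next`) are hypothetical. The sketch also makes three assumptions the abstract does not state: the temporary storage is modeled as a set shared by all processing units, the notification is sent to every peer scheduling unit, and "prioritizing" means moving matching work groups to the front of the pending pool.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkGroup:
    group_id: int
    resource_id: str  # resource (e.g. a memory block) the group's work elements use


class Cache:
    """Temporary storage; assumed here to be shared by all processing units."""

    def __init__(self):
        self.lines = set()

    def fetch(self, resource_id):
        """Return True on a hit; on a miss, model a load from main storage."""
        hit = resource_id in self.lines
        self.lines.add(resource_id)  # the request leaves the resource cached
        return hit


class Scheduler:
    """Scheduling unit of one processing unit, holding its pool of pending work."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache
        self.pending = deque()  # pool of pending work groups
        self.peers = []         # scheduling units of the other processing units

    def on_resource_requested(self, resource_id):
        """Notification handler: prioritize pending groups that use the resource."""
        matching = [g for g in self.pending if g.resource_id == resource_id]
        if matching:
            others = [g for g in self.pending if g.resource_id != resource_id]
            self.pending = deque(matching + others)

    def run_next(self):
        """Take the next work group, request its resource, and notify the peers."""
        group = self.pending.popleft()
        hit = self.cache.fetch(group.resource_id)
        for peer in self.peers:
            peer.on_resource_requested(group.resource_id)
        return group.group_id, hit


# Two processing units; the second has a pending group that needs resource "A".
cache = Cache()
pu1, pu2 = Scheduler("PU1", cache), Scheduler("PU2", cache)
pu1.peers.append(pu2)
pu2.peers.append(pu1)
pu1.pending.append(WorkGroup(1, "A"))
pu2.pending.extend([WorkGroup(2, "B"), WorkGroup(3, "A")])

first = pu1.run_next()   # PU1 requests "A" from main storage and notifies PU2
second = pu2.run_next()  # PU2 now runs group 3 first and finds "A" in the cache
```

In this sketch, without the notification the second unit would process group 2 first, and resource "A" might be evicted from a real cache before group 3 runs. Reordering the pool lets the second unit reuse the resource while it is still in temporary storage.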