An efficient template for implementing iterated parallel loops, i.e. parallel loops nested in a sequential loop, on distributed-memory multiprocessors is presented. The template is explicitly designed to smooth the unbalanced processor workloads that arise from loops whose iterations have highly varying execution times. Experiments show performance gains with respect to the support provided by HPF-like languages.