This paper considers the automatic restructuring of loops with conditional branching for parallel processing, focusing on a class of loops termed "conditional cyclic loops." A conditional cyclic loop contains a dependence cycle caused by conditional branching across loop iterations, which makes it difficult to parallelize. In general, parallel execution of a conditional cyclic loop provides little benefit because it requires solving a full-order nonlinear Boolean recurrence relation. In practice, however, the Boolean recurrence often takes simpler forms. With these simpler forms, the number of possible predicate values of the conditional branches is drastically smaller than in a general conditional cyclic loop. Conditional cyclic loops of these simple forms can be parallelized for O(p / log p) speedup with p processors.
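To make the idea of a simple Boolean recurrence concrete, the following is a minimal illustrative sketch and not the paper's algorithm. It assumes one particular simple form: a first-order recurrence in which the branch predicate of iteration i is a unary Boolean function of the predicate of iteration i-1. Under that assumption each iteration contributes one of only four functions (constant false, constant true, identity, negation), and because function composition is associative, all predicates can be resolved with a prefix scan. The names `BoolFn`, `compose`, `sequential_predicates`, and `scan_predicates` are hypothetical, and the scan is simulated sequentially here; only the loop over `i` inside each round is the part that could run in parallel.

```python
from typing import List, Tuple

# A unary Boolean function is stored as the pair (f(False), f(True)).
BoolFn = Tuple[bool, bool]

IDENTITY: BoolFn = (False, True)
NEGATE: BoolFn = (True, False)
CONST_FALSE: BoolFn = (False, False)
CONST_TRUE: BoolFn = (True, True)


def compose(f: BoolFn, g: BoolFn) -> BoolFn:
    """Return h with h(x) = g(f(x)), i.e. apply f first, then g."""
    return (g[f[0]], g[f[1]])


def sequential_predicates(fns: List[BoolFn], p0: bool) -> List[bool]:
    """Reference version: resolve the predicates one iteration at a time."""
    out = []
    p = p0
    for f in fns:
        p = f[p]  # the predicate of this iteration depends on the previous one
        out.append(p)
    return out


def scan_predicates(fns: List[BoolFn], p0: bool) -> List[bool]:
    """Resolve all predicates with an inclusive scan over composition.

    Each round doubles the span of iterations summarized at position i.
    The updates within one round are independent of each other.
    """
    prefix = list(fns)
    n = len(prefix)
    step = 1
    while step < n:
        nxt = list(prefix)
        for i in range(step, n):  # independent updates within a round
            nxt[i] = compose(prefix[i - step], prefix[i])
        prefix = nxt
        step *= 2
    # prefix[i] now maps the initial predicate directly to predicate i.
    return [f[p0] for f in prefix]


example = [IDENTITY, NEGATE, CONST_TRUE, NEGATE, IDENTITY]
resolved = scan_predicates(example, False)
```

Once the predicates are known, the branch taken in every iteration is fixed, so the loop bodies no longer depend on one another through the condition. The scan above takes a logarithmic number of rounds, which is one common way a log factor arises in bounds of this kind; the paper's own construction may differ.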