We investigate the costs and benefits of implementing memory interleaving in software. As our main contribution, we compare software memory interleaving with row-major allocation and logarithmic broadcasting. Our analysis demonstrates the clear superiority of software interleaving over row-major allocation in the presence of memory contention. It also indicates that the choice between software interleaving and logarithmic broadcasting is less clear, as it depends on both the type of synchronization used and the number of processors. We conclude that, on large-scale multiprocessors, the combination of software memory interleaving and lock-based synchronization is the most effective for reducing memory contention in matrix computations.