We analyze the single-node performance of sparse matrix-vector multiplication by investigating issues of data locality and fine-grained parallelism. We examine the data locality characteristics of the compressed sparse row representation and consider improvements in locality through matrix permutation. Motivated by potential improvements in fine-grained parallelism, we evaluate modified sparse matrix representations. The results lead to general conclusions about improving the single-node performance of sparse matrix-vector multiplication in parallel libraries of sparse iterative solvers.
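
For readers unfamiliar with the compressed sparse row representation, the kernel under study can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `csr_spmv`, the array names `row_ptr`, `col_idx`, and `values`, and the 3x3 example matrix are all assumptions made for this sketch. It shows where the locality question arises: the matrix arrays are read with unit stride, while the source vector `x` is read indirectly through the column indices.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A * x for a matrix A stored in CSR form (illustrative sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # values and col_idx are traversed contiguously; x is accessed
            # indirectly via col_idx[k], so its access pattern depends on
            # the sparsity structure (and hence on any matrix permutation).
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical example matrix (assumed for illustration only):
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]
y = csr_spmv(row_ptr, col_idx, values, x)
```

In this layout, reordering rows and columns changes the sequence of indices in `col_idx` and therefore the order in which entries of `x` are touched, which is the mechanism by which a permutation can affect locality.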