Modern numerical simulations often require solving extremely large sparse linear systems. Solving these systems with Krylov iterative methods requires repeated sparse matrix-vector multiplications, which can be the most computationally expensive part of the simulation. Because Graphics Processing Units (GPUs) offer significantly higher floating-point throughput and memory bandwidth than conventional Central Processing Units (CPUs), performing sparse matrix-vector multiplications on these co-processors can reduce the time required to solve a given linear system. In this paper, we investigate the performance of sparse matrix-vector multiplication across multiple GPUs. The investigation is carried out in the context of solving symmetric positive-definite linear systems with the PETSc library, using a conjugate-gradient iteration preconditioned with a least-squares polynomial preconditioner.
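To illustrate why the sparse matrix-vector product dominates the cost of a Krylov solve, the following is a minimal sketch of a conjugate-gradient iteration built around a compressed sparse row (CSR) product. It is not the PETSc implementation studied here: it runs on the CPU in pure Python, it omits the least-squares polynomial preconditioner, and all names (`csr_matvec`, `conjugate_gradient`, `laplacian_1d_csr`) as well as the small tridiagonal test matrix are assumptions chosen for illustration.

```python
# Illustrative sketch only: pure-Python CSR SpMV and unpreconditioned CG.
# Not PETSc code, not GPU code; all names here are hypothetical.

def csr_matvec(indptr, indices, data, x):
    """Return y = A x for a matrix A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Nonzeros of this row occupy data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Returns (x, spmv_calls) so the number of SpMV calls is visible.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x with x = 0
    p = list(r)            # initial search direction
    rs_old = dot(r, r)
    spmv_calls = 0
    if rs_old ** 0.5 <= tol:
        return x, spmv_calls
    for _ in range(max_iter):
        Ap = csr_matvec(indptr, indices, data, p)   # one SpMV per iteration
        spmv_calls += 1
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new ** 0.5 <= tol:
            break
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, spmv_calls


def laplacian_1d_csr(n):
    """Build the SPD matrix tridiag(-1, 2, -1) of size n in CSR form."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data


if __name__ == "__main__":
    indptr, indices, data = laplacian_1d_csr(8)
    x, spmv_calls = conjugate_gradient(indptr, indices, data, [1.0] * 8)
    print("SpMV calls:", spmv_calls)
```

Each iteration performs exactly one sparse matrix-vector product plus a handful of vector updates and dot products, so for large systems the product is the natural candidate for GPU acceleration. A polynomial preconditioner would add further matrix-vector products per iteration, since applying a polynomial in the matrix to a vector is itself a sequence of such products.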