Journal of Parallel and Distributed Computing

A model-driven blocking strategy for load balanced sparse matrix-vector multiplication on GPUs



Abstract

Sparse matrix-vector multiplication (SpMV) is one of the key operations in linear algebra. Optimizing SpMV on GPUs is challenging because the sparsity and irregularity of the input cause thread divergence, load imbalance, and uncoalesced, indirect memory accesses. In this paper we present a new Blocked Row-Column (BRC) storage format with a two-dimensional blocking mechanism that addresses these challenges effectively. It reduces thread divergence by reordering the rows of the input matrix and blocking rows with a nearly equal number of non-zero elements onto the same execution units (i.e., warps). BRC improves load balance by partitioning rows into blocks with a constant number of non-zeros, so that different warps perform the same amount of work. We also present an approach to optimizing BRC performance by judiciously selecting the block size based on the sparsity characteristics of the matrix. A CUDA implementation of BRC outperforms the NVIDIA CUSP and cuSPARSE libraries and other state-of-the-art SpMV formats on a range of unstructured sparse matrices from multiple application domains. The BRC format has been integrated with PETSc, enabling its use in PETSc's solvers. Furthermore, when the input matrix is partitioned, BRC achieves near-linear speedup on multiple GPUs.
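
The two-dimensional blocking idea in the abstract can be illustrated with a small CPU-side sketch in Python. This is not the paper's BRC layout, which this record does not specify. It is a simplified model under stated assumptions: rows are cut into segments of at most `BLOCK_WIDTH` non-zeros (column blocking), the segments are sorted by length, and consecutive segments are grouped into blocks of `WARP_SIZE` (row blocking). The names `build_blocks`, `spmv_blocked`, `WARP_SIZE`, and `BLOCK_WIDTH` are hypothetical, and the warp size is set to 4 only to keep the example small.

```python
WARP_SIZE = 4    # illustrative only; NVIDIA warps have 32 threads
BLOCK_WIDTH = 2  # hypothetical cap on non-zeros handled by one thread

# CSR arrays for a 5x5 matrix whose rows hold 5, 1, 2, 0 and 3 non-zeros.
row_ptr = [0, 5, 6, 8, 8, 11]
col_idx = [0, 1, 2, 3, 4, 1, 0, 4, 1, 2, 3]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]


def spmv_csr(row_ptr, col_idx, vals, x):
    """Reference SpMV: one row per unit of work, so work per row varies."""
    return [
        sum(vals[k] * x[col_idx[k]] for k in range(row_ptr[r], row_ptr[r + 1]))
        for r in range(len(row_ptr) - 1)
    ]


def build_blocks(row_ptr, warp_size=WARP_SIZE, width=BLOCK_WIDTH):
    """Cut rows into segments, sort by length, group into warp-sized blocks."""
    segments = []  # each entry is (row, start, end) into col_idx / vals
    for row in range(len(row_ptr) - 1):
        for start in range(row_ptr[row], row_ptr[row + 1], width):
            end = min(start + width, row_ptr[row + 1])
            segments.append((row, start, end))
    # Longest segments first, so neighbours in a block do similar work.
    segments.sort(key=lambda s: s[2] - s[1], reverse=True)
    return [segments[i:i + warp_size] for i in range(0, len(segments), warp_size)]


def spmv_blocked(blocks, col_idx, vals, x, n_rows):
    """SpMV over the blocks; a block models a warp, a segment models a thread."""
    y = [0.0] * n_rows
    for block in blocks:
        for row, start, end in block:
            acc = 0.0
            for k in range(start, end):
                acc += vals[k] * x[col_idx[k]]
            # Several segments may share a row; a GPU kernel would need an
            # atomic add or a reduction at this point.
            y[row] += acc
    return y


blocks = build_blocks(row_ptr)
y_ref = spmv_csr(row_ptr, col_idx, vals, x)
y_blk = spmv_blocked(blocks, col_idx, vals, x, len(row_ptr) - 1)
print(y_ref == y_blk, [[end - start for _, start, end in b] for b in blocks])
```

In this toy input every block ends up with segments of equal length, which is the property the abstract associates with reduced divergence and balanced load. In general the segment lengths inside a block only stay close to each other, and the choice of `BLOCK_WIDTH` trades padding against the number of partial sums per row.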

Bibliographic record

  • Source
  • Author affiliations

    Department of Computer Science and Engineering, The Ohio State University, United States;

    Department of Computer Science and Engineering, The Ohio State University, United States;

    Department of Computer Science and Engineering, The Ohio State University, United States;

    Department of Computer Science and Engineering, The Ohio State University, United States;

  • Indexing information
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    SpMV; GPU; CUDA;

