ACM Transactions on Mathematical Software

Optimizing Sparse Matrix-Matrix Multiplication for the GPU



Abstract

Sparse matrix-matrix multiplication (SpGEMM) is a key operation in numerous areas, from information science to the physical sciences. Implementing SpGEMM efficiently on throughput-oriented processors, such as the graphics processing unit (GPU), requires the programmer to expose substantial fine-grained parallelism while conserving the limited off-chip memory bandwidth. Balancing these concerns, we decompose the SpGEMM operation into three highly parallel phases: expansion, sorting, and contraction, and introduce a set of complementary bandwidth-saving performance optimizations. Our implementation is fully general, and our optimization strategy adaptively processes the SpGEMM workload row-wise to substantially improve performance by decreasing the work complexity and utilizing the memory hierarchy more effectively.
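To make the three phases concrete, the following is a minimal CPU sketch of the expansion-sort-contraction idea described in the abstract, written in plain Python over coordinate (row, col, val) triples. The function name `spgemm_esc` and the triple representation are illustrative assumptions for exposition; they are not the paper's GPU implementation or API.

```python
from collections import defaultdict

def spgemm_esc(A, B):
    """Compute C = A @ B, where A and B are lists of (row, col, val) triples."""
    # Index B by row so each nonzero A[i,k] can find all nonzeros B[k,j].
    B_by_row = defaultdict(list)
    for k, j, v in B:
        B_by_row[k].append((j, v))

    # 1) Expansion: emit one partial product per pairing of A[i,k] with B[k,j].
    expanded = [(i, j, va * vb)
                for i, k, va in A
                for j, vb in B_by_row[k]]

    # 2) Sorting: order partial products by output coordinate (i, j).
    expanded.sort(key=lambda t: (t[0], t[1]))

    # 3) Contraction: sum runs of partial products sharing the same (i, j).
    result = []
    for i, j, v in expanded:
        if result and result[-1][0] == i and result[-1][1] == j:
            result[-1][2] += v
        else:
            result.append([i, j, v])
    return [tuple(t) for t in result]

if __name__ == "__main__":
    # A = [[1, 2], [0, 3]], B = [[4, 0], [5, 6]]  ->  C = [[14, 12], [15, 18]]
    A = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0)]
    B = [(0, 0, 4.0), (1, 0, 5.0), (1, 1, 6.0)]
    print(spgemm_esc(A, B))
```

On a GPU, each of these phases maps to a data-parallel primitive (per-nonzero expansion, a segmented or global sort, and a reduction by key), which is what exposes the fine-grained parallelism the abstract refers to; the sketch above only shows the sequential semantics.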
