ACM Transactions on Mathematical Software

Optimizing Sparse Matrix-Matrix Multiplication for the GPU



Abstract

Sparse matrix-matrix multiplication (SpGEMM) is a key operation in numerous areas, from information science to the physical sciences. Implementing SpGEMM efficiently on throughput-oriented processors, such as the graphics processing unit (GPU), requires the programmer to expose substantial fine-grained parallelism while conserving the limited off-chip memory bandwidth. Balancing these concerns, we decompose the SpGEMM operation into three highly parallel phases: expansion, sorting, and contraction, and introduce a set of complementary bandwidth-saving performance optimizations. Our implementation is fully general, and our optimization strategy adaptively processes the SpGEMM workload row-wise to substantially improve performance by decreasing the work complexity and utilizing the memory hierarchy more effectively.
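To make the three phases concrete, the following is a minimal CPU sketch of the expansion-sort-contraction idea described in the abstract, written in plain Python over coordinate (row, col, val) triples. The function name `spgemm_esc` and the triple representation are illustrative assumptions for exposition; they are not the paper's GPU implementation or API.

```python
from collections import defaultdict

def spgemm_esc(A, B):
    """Compute C = A @ B, where A and B are lists of (row, col, val) triples."""
    # Index B by row so each nonzero A[i,k] can find all nonzeros B[k,j].
    B_by_row = defaultdict(list)
    for k, j, v in B:
        B_by_row[k].append((j, v))

    # 1) Expansion: emit one partial product per pairing of A[i,k] with B[k,j].
    expanded = [(i, j, va * vb)
                for i, k, va in A
                for j, vb in B_by_row[k]]

    # 2) Sorting: order partial products by output coordinate (i, j).
    expanded.sort(key=lambda t: (t[0], t[1]))

    # 3) Contraction: sum runs of partial products sharing the same (i, j).
    result = []
    for i, j, v in expanded:
        if result and result[-1][0] == i and result[-1][1] == j:
            result[-1][2] += v
        else:
            result.append([i, j, v])
    return [tuple(t) for t in result]

if __name__ == "__main__":
    # A = [[1, 2], [0, 3]], B = [[4, 0], [5, 6]]  ->  C = [[14, 12], [15, 18]]
    A = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0)]
    B = [(0, 0, 4.0), (1, 0, 5.0), (1, 1, 6.0)]
    print(spgemm_esc(A, B))
```

On a GPU, each of these phases maps to a data-parallel primitive (per-nonzero expansion, a segmented or global sort, and a reduction by key), which is what exposes the fine-grained parallelism the abstract refers to; the sketch above only shows the sequential semantics.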
