Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications



Abstract

Sparse matrix-vector multiplication (SpMV) is a widely used computational kernel. The most commonly used format for a sparse matrix is CSR (Compressed Sparse Row), but a number of other representations have recently been developed that achieve higher SpMV performance. However, these alternative representations typically impose a significant preprocessing overhead. A high preprocessing overhead can be amortized in applications that invoke SpMV many times on the same matrix, but this is not always feasible, for instance when analyzing large, dynamically evolving graphs. This paper presents ACSR, an adaptive SpMV algorithm that uses the standard CSR format but reduces thread divergence by combining rows into groups (bins) that have a similar number of non-zero elements. Further, for rows in bins that span a wide range of non-zero counts, dynamic parallelism is leveraged. A significant benefit of ACSR over other proposed SpMV approaches is that it works directly with the standard CSR format and thus avoids significant preprocessing overheads. A CUDA implementation of ACSR is shown to outperform the SpMV implementations in the NVIDIA CUSP and cuSPARSE libraries on a set of sparse matrices representing power-law graphs. We also demonstrate the use of ACSR for the analysis of dynamic graphs, where the improvement over extant approaches is even higher.
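
The binning idea described in the abstract can be illustrated with a minimal CPU-side sketch in Python. This is not the authors' CUDA implementation: the power-of-two bin boundaries and the names `bin_rows_by_nnz`, `csr_row_dot`, and `binned_spmv` are assumptions made for illustration, and the loop over bins stands in for what would plausibly be one GPU kernel launch per bin.

```python
from collections import defaultdict


def bin_rows_by_nnz(row_ptr):
    """Group row indices by non-zero count, reading only the CSR row pointer.

    Assumption (not stated in the abstract): bin b holds rows whose
    non-zero count lies in (2**(b-1), 2**b]; rows with 0 or 1 non-zeros
    go to bin 0.
    """
    bins = defaultdict(list)
    for row in range(len(row_ptr) - 1):
        nnz = row_ptr[row + 1] - row_ptr[row]
        b = 0 if nnz <= 1 else (nnz - 1).bit_length()
        bins[b].append(row)
    return dict(bins)


def csr_row_dot(row_ptr, col_idx, vals, x, row):
    """Dot product of one CSR row with the dense vector x."""
    start, end = row_ptr[row], row_ptr[row + 1]
    return sum(vals[k] * x[col_idx[k]] for k in range(start, end))


def binned_spmv(row_ptr, col_idx, vals, x):
    """Compute y = A @ x, processing rows bin by bin.

    The matrix arrays are left untouched: only the grouping of row
    indices is added on top of standard CSR.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for b, rows in sorted(bin_rows_by_nnz(row_ptr).items()):
        # On a GPU, rows in the same bin would share one launch
        # configuration, so threads do similar amounts of work.
        for row in rows:
            y[row] = csr_row_dot(row_ptr, col_idx, vals, x, row)
    return y


# Hypothetical 4x4 matrix with one heavy row and three light rows,
# loosely mimicking a power-law degree distribution.
row_ptr = [0, 4, 5, 6, 7]
col_idx = [0, 1, 2, 3, 0, 1, 2]
vals = [1, 2, 3, 4, 5, 6, 7]
x = [1, 1, 1, 1]

bins = bin_rows_by_nnz(row_ptr)
y = binned_spmv(row_ptr, col_idx, vals, x)
```

In this sketch the heavy row 0 lands in its own bin while rows 1 to 3 share bin 0, and the binning step needs only one pass over `row_ptr`, which is the sense in which a CSR-based approach avoids reformatting the matrix.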


Bibliographic details

Source
International Conference for High Performance Computing, Networking, Storage and Analysis | 2014 | pp. 781-792 | 12 pages
Authors
Ashari Arash; Sedaghati Naser; Eisenlohr John; Parthasarathy Srinivasan; Sadayappan P.
Original format: PDF
Keywords
graph theory; graphics processing units; mathematics computing; matrix multiplication; parallel architectures; ACSR; CSR format; CUDA implementation; GPUs; NVIDIA CUSP libraries; SpMV approach; adaptive SpMV algorithm; compressed sparse row; computational kernel; cuSPARSE libraries; dynamic graphs; dynamic parallelism; fast sparse matrix-vector multiplication; graph applications; iterative invocations; power-law graphs; thread divergence; Heuristic algorithms; Instruction sets; Kernel; Parallel processing; Sparse matrices; Standards; Vectors; ACSR; CSR; GPU; HYB; SpMV;


Similar literature


1. Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs [J]. Abdelfattah Ahmad, Ltaief Hatem, Keyes David. Concurrency and computation: practice and experience. 2016, No. 12
2. GPU accelerated sparse matrix-vector multiplication and sparse matrix-transpose vector multiplication [J]. Yuan Tao, Yangdong Deng, Shuai Mu. Concurrency and computation: practice and experience. 2015, No. 14
3. A Family of Bit-Representation-Optimized Formats for Fast Sparse Matrix-Vector Multiplication on the GPU [J]. Tang Wai Teng, Tan Wen Jun, Goh Rick Siow Mong. Parallel and Distributed Systems, IEEE Transactions on. 2015, No. 9
4. Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining [C]. Xintian Yang, Srinivasan Parthasarathy, P. Sadayappan. VLDB 2011; International conference on very large data bases. 2012
5. Analysis of High Performance Sparse Matrix-Vector Multiplication for Small Finite Fields [D]. Lambert, Matthew A. 2020
6. A Fast Spatial Clustering Method for Sparse LiDAR Point Clouds Using GPU Programming [O]. Yifei Tian, Wei Song, Long Chen. 2020
7. Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs) [O]. Sarah AlAhmadi, Thaha Mohammed, Aiiad Albeshri. 2020
