Journal: Parallel Processing Letters

A Novel Multi-GPU Parallel Optimization Model for The Sparse Matrix-Vector Multiplication


Abstract

Accelerating sparse matrix-vector multiplication (SpMV) on graphics processing units (GPUs) has attracted considerable attention recently. We observe that on a given multi-GPU platform, SpMV performance can usually be greatly improved when the matrix is partitioned into several blocks according to a predetermined rule and each block is assigned to a GPU with an appropriate storage format. This motivates us to propose a novel multi-GPU parallel SpMV optimization model. Our model involves two stages. In the first stage, a simple rule is defined to divide any given matrix among multiple GPUs, and a performance model, which is independent of the problems and dependent only on the resources of the devices, is proposed to accurately predict the execution time of SpMV kernels. Using these models, in the second stage we construct an optimal multi-GPU parallel SpMV algorithm that is generated automatically and rapidly for the platform for any problem. Because our model for SpMV is general, independent of the problems, and dependent only on the device resources, it needs to be constructed only once for each type of GPU. Experiments validate the high efficiency of the proposed model.
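A minimal sketch of the idea the abstract describes, under stated assumptions: the paper's actual partition rule, performance model, and per-block storage-format selection are not given here, so this example simply splits the matrix row-wise into even blocks, stores each block in CSR, and computes the SpMV block by block as if each block were assigned to a separate device. All function names are illustrative, not from the paper.

```python
# Hedged illustration only: the paper's real model chooses partition sizes and
# storage formats per device; here every block uses CSR and rows are split evenly.

def to_csr(dense_rows):
    """Convert a list of dense rows to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense_rows:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for one CSR-stored block."""
    y = []
    for r in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y

def multi_block_spmv(dense_matrix, x, num_blocks):
    """Partition rows evenly among num_blocks 'devices' and concatenate results."""
    n = len(dense_matrix)
    base, rem = divmod(n, num_blocks)
    y, start = [], 0
    for b in range(num_blocks):
        rows = base + (1 if b < rem else 0)  # spread remainder over first blocks
        block = dense_matrix[start:start + rows]
        y.extend(csr_spmv(*to_csr(block), x))
        start += rows
    return y

# Usage: a 4x3 sparse matrix split across two "devices".
A = [[1, 0, 2],
     [0, 3, 0],
     [4, 0, 5],
     [0, 6, 0]]
print(multi_block_spmv(A, [1, 1, 1], 2))  # → [3.0, 3.0, 9.0, 6.0]
```

On real hardware each block's SpMV would run as a kernel on its own GPU (e.g. via cuSPARSE), and the partition sizes would come from the predicted per-device execution times rather than an even row split.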
