IEEE Transactions on Parallel and Distributed Systems

High-Level Strategies for Parallel Shared-Memory Sparse Matrix-Vector Multiplication

Abstract

The sparse matrix-vector multiplication is an important computational kernel, but is hard to efficiently execute even in the sequential case. The problems--namely low arithmetic intensity, inefficient cache use, and limited memory bandwidth--are magnified as the core count on shared-memory parallel architectures increases. Existing techniques are discussed in detail, and categorized chiefly based on their distribution types. Based on this, new parallelization techniques are proposed. The theoretical scalability and memory usage of the various strategies are analyzed, and experiments on multiple NUMA architectures confirm the validity of the results. One of the newly proposed methods attains the best average result in experiments on a large set of matrices. In one of the experiments it obtains a parallel efficiency of 90 percent, while on average it performs close to 60 percent.
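To make the kernel concrete, the sketch below shows a plain CSR-based SpMV with a 1D row-wise OpenMP parallelization. This is only a baseline illustration of the shared-memory setting the paper addresses; the CSR format, the static row distribution, and the names used here are assumptions for illustration, not the specific high-level strategies proposed in the paper.

#include <omp.h>

/* Compressed Sparse Row (CSR) storage for an m-by-n sparse matrix. */
typedef struct {
    int     m;        /* number of rows                    */
    int    *row_ptr;  /* m+1 offsets into col[] and val[]  */
    int    *col;      /* column index of each nonzero      */
    double *val;      /* value of each nonzero             */
} csr_matrix;

/* y = A * x with a 1D row-wise distribution: each thread handles a
 * static block of rows, so writes to y are disjoint and need no locks. */
void spmv_csr(const csr_matrix *A, const double *restrict x, double *restrict y)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < A->m; ++i) {
        double sum = 0.0;
        for (int k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
            sum += A->val[k] * x[A->col[k]];
        y[i] = sum;
    }
}

Even this simple variant exposes the issues the abstract lists: each nonzero performs one multiply-add but must load a value, a column index, and an irregularly accessed entry of x, so arithmetic intensity is low, cache reuse of x is hard to guarantee, and performance is bound by memory bandwidth as the thread count grows.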
