Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks

Abstract

This paper introduces a storage format for sparse matrices, called compressed sparse blocks (CSB), which allows both Ax and Aᵀx to be computed efficiently in parallel, where A is an n×n sparse matrix with nnz ≥ n nonzeros and x is a dense n-vector. Our algorithms use Θ(nnz) work (serial running time) and Θ(√n lg n) span (critical-path length), yielding a parallelism of Θ(nnz/(√n lg n)), which is amply high for virtually any large matrix. The storage requirement for CSB is the same as that for the more-standard compressed-sparse-rows (CSR) format, for which computing Ax in parallel is easy but Aᵀx is difficult. Benchmark results indicate that on one processor, the CSB algorithms for Ax and Aᵀx run just as fast as the CSR algorithm for Ax, but the CSB algorithms also scale up linearly with processors until limited by off-chip memory bandwidth.
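The abstract does not spell out the data layout, so the following is only a minimal serial sketch of a block-partitioned format in the spirit of CSB, not the authors' implementation. It assumes β×β blocks with β on the order of √n, a block-pointer array, and per-nonzero offsets within each block; the names `build_csb`, `csb_multiply`, `blk_ptr`, `row_lo`, and `col_lo` are hypothetical. It illustrates why such a layout treats Ax and Aᵀx symmetrically: the same traversal serves both products, with only the roles of the row and column index swapped.

```python
import math
from collections import defaultdict


def build_csb(n, triples, beta=None):
    """Pack (row, col, value) triples of an n x n matrix into beta x beta blocks."""
    if beta is None:
        beta = max(1, math.isqrt(n))  # assumed block side, on the order of sqrt(n)
    nblocks = -(-n // beta)           # ceil(n / beta) blocks per dimension
    buckets = defaultdict(list)
    for i, j, v in triples:
        # Store only the offsets inside the block; the block id supplies the rest.
        buckets[(i // beta, j // beta)].append((i % beta, j % beta, v))
    blk_ptr, row_lo, col_lo, val = [0], [], [], []
    for bi in range(nblocks):
        for bj in range(nblocks):
            for li, lj, v in buckets.get((bi, bj), ()):
                row_lo.append(li)
                col_lo.append(lj)
                val.append(v)
            blk_ptr.append(len(val))  # end of block (bi, bj) in the flat arrays
    return {"n": n, "beta": beta, "nblocks": nblocks,
            "blk_ptr": blk_ptr, "row_lo": row_lo, "col_lo": col_lo, "val": val}


def csb_multiply(csb, x, transpose=False):
    """Serially compute y = A x, or y = A^T x when transpose is True."""
    n, beta, nb = csb["n"], csb["beta"], csb["nblocks"]
    y = [0.0] * n
    for bi in range(nb):
        for bj in range(nb):
            b = bi * nb + bj
            for k in range(csb["blk_ptr"][b], csb["blk_ptr"][b + 1]):
                i = bi * beta + csb["row_lo"][k]  # global row index
                j = bj * beta + csb["col_lo"][k]  # global column index
                if transpose:
                    y[j] += csb["val"][k] * x[i]
                else:
                    y[i] += csb["val"][k] * x[j]
    return y


if __name__ == "__main__":
    # A = [[2, 0, 0, 1], [0, 0, 0, 0], [0, 4, 0, 0], [0, 0, 5, 0]]
    A = build_csb(4, [(0, 0, 2.0), (0, 3, 1.0), (2, 1, 4.0), (3, 2, 5.0)], beta=2)
    x = [1, 2, 3, 4]
    print(csb_multiply(A, x))                  # [6.0, 0.0, 8.0, 15.0]
    print(csb_multiply(A, x, transpose=True))  # [2.0, 12.0, 20.0, 1.0]
```

The sketch is serial and keeps nonzeros within a block in insertion order. The parallel scheduling behind the work and span bounds quoted in the abstract is not reproduced here.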

Bibliographic details

Source
21st Annual Symposium on Parallelism in Algorithms and Architectures 2009 | 2009 | pp. 233-244 | 12 pages
Authors
Aydın Buluç; Jeremy T. Fineman; Matteo Frigo; John R. Gilbert; Charles E. Leiserson
Original format: PDF
Chinese Library Classification: Computer software
Keywords
compressed sparse blocks; compressed sparse columns; compressed sparse rows; matrix transpose; matrix-vector multiplication; multithreaded algorithm; parallelism; span; sparse matrix; storage format; work;

Similar literature

1. Locality-Aware Parallel Sparse Matrix-Vector and Matrix-Transpose-Vector Multiplication on Many-Core Processors [J]. M. Ozan Karsavuran, Kadir Akbudak, Cevdet Aykanat. IEEE Transactions on Parallel and Distributed Systems. 2016, No. 6
2. Blocked-Based Sparse Matrix-Vector Multiplication on Distributed Memory Parallel Computers [J]. Rukhsana Shahnaz, Anila Usman. The International Arab Journal of Information Technology. 2011, No. 2
3. Efficient multithreaded untransposed, transposed or symmetric sparse matrix-vector multiplication with the Recursive Sparse Blocks format [J]. Martone Michele. Parallel Computing. 2014, No. 7
4. Parallel Sparse Matrix-Vector and Matrix-Transpose-Vector Multiplication Using Compressed Sparse Blocks [C]. Aydın Buluç, John R. Gilbert, Jeremy T. Fineman. 21st Annual Symposium on Parallelism in Algorithms and Architectures 2009. 2009
5. Analysis of High Performance Sparse Matrix-Vector Multiplication for Small Finite Fields [D]. Lambert, Matthew A. 2020
6. Motion-compensated compressed sensing for dynamic contrast-enhanced MRI using regional spatiotemporal sparsity and region tracking: Block LOw-rank Sparsity with Motion-guidance (BLOSM) [O]. Xiao Chen, Michael Salerno, Yang Yang
7. Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks [O]. Aydın Buluç, Jeremy T. Fineman, Matteo Frigo. 2009
