Algorithm 980: Sparse QR Factorization on the GPU

Yeralan Sencer Nuri; Davis Timothy A.; Sid-Lakhdar Wissam M.; Ranka Sanjay

Abstract

Sparse matrix factorization involves a mix of regular and irregular computation, which is a particular challenge when trying to obtain high performance on the highly parallel general-purpose computing cores available on graphics processing units (GPUs). We present a sparse multifrontal QR factorization method that meets this challenge and is significantly faster than a highly optimized method on a multicore CPU. Our method factorizes many frontal matrices in parallel and keeps all the data transmitted between frontal matrices on the GPU. A novel bucket scheduler algorithm extends the communication-avoiding QR factorization for dense matrices by exploiting more parallelism and by exploiting the staircase form present in the frontal matrices of a sparse multifrontal method.
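
The abstract refers to communication-avoiding QR factorization for dense matrices. As a rough illustration of that general idea only, the sketch below factorizes row blocks of a tall dense matrix independently and then factorizes the stacked triangular factors. It is not the paper's bucket scheduler, it does not exploit the staircase form, and it runs on the CPU in plain Python. The function names `qr_r` and `tsqr_r`, the use of Givens rotations, and the single-level reduction are all assumptions made for this example.

```python
import math

def qr_r(rows):
    """Return the upper triangular factor R of a dense matrix (list of rows),
    computed with Givens rotations and normalized to a non-negative diagonal."""
    a = [[float(v) for v in row] for row in rows]
    m, n = len(a), len(a[0])
    for j in range(n):
        # Zero column j below the diagonal, working from the bottom row upward.
        for i in range(m - 1, j, -1):
            x, y = a[i - 1][j], a[i][j]
            if y == 0.0:
                continue
            r = math.hypot(x, y)
            c, s = x / r, y / r
            a[i - 1][j], a[i][j] = r, 0.0
            # Apply the same rotation to the remaining columns of both rows.
            for k in range(j + 1, n):
                u, v = a[i - 1][k], a[i][k]
                a[i - 1][k] = c * u + s * v
                a[i][k] = c * v - s * u
    r_factor = [row[:] for row in a[:n]]
    # Flip row signs so the diagonal is non-negative (makes R comparable).
    for i in range(min(m, n)):
        if r_factor[i][i] < 0.0:
            r_factor[i] = [-v for v in r_factor[i]]
    return r_factor

def tsqr_r(rows, block_rows):
    """Hypothetical one-level reduction: factorize each row block on its own,
    then factorize the stacked block R factors to obtain the final R."""
    blocks = [rows[i:i + block_rows] for i in range(0, len(rows), block_rows)]
    local_r = [qr_r(block) for block in blocks]   # independent, parallelizable
    stacked = [row for r in local_r for row in r]
    return qr_r(stacked)

# Small example: an 8-by-3 matrix with full column rank.
a = [[1.0, float(t), float(t * t)] for t in range(1, 9)]
r_direct = qr_r(a)
r_tsqr = tsqr_r(a, block_rows=4)
max_diff = max(abs(r_direct[i][j] - r_tsqr[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

Because the per-block factorizations do not depend on one another, they could run concurrently, which is the property a communication-avoiding scheme relies on. For a matrix with full column rank, the blocked result should agree with the direct factorization up to rounding error.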

Bibliographic record

Source
ACM Transactions on Mathematical Software | 2017, Issue 2 | pp. 17.1-17.29 | 29 pages
Authors
Yeralan Sencer Nuri; Davis Timothy A.; Sid-Lakhdar Wissam M.; Ranka Sanjay
Author affiliations

Univ Florida, Gainesville, FL 32611 USA;

Texas A&M Univ, Dept Comp Sci & Engn, 3112 TAMU, College Stn, TX 77843 USA;

Texas A&M Univ, Dept Comp Sci & Engn, 3112 TAMU, College Stn, TX 77843 USA;

Univ Florida, Dept Comp & Informat Sci & Engn, E301 CSE Bldg,POB 116120, Gainesville, FL 32611 USA;

Original format: PDF
Language: English
Keywords
Algorithms; Experimentation; Performance; QR factorization; least-square problems; sparse matrices; GPU;
