Communication-Avoiding Cholesky-QR2 for Rectangular Matrices

Abstract

Scalable QR factorization algorithms for solving least squares and eigenvalue problems are critical given the increasing parallelism within modern machines. We introduce a more general parallelization of the CholeskyQR2 algorithm and show its effectiveness for a wide range of matrix sizes. Our algorithm executes over a 3D processor grid, the dimensions of which can be tuned to trade off costs in synchronization, interprocessor communication, computational work, and memory footprint. We implement this algorithm, yielding a code that can achieve a factor of Θ(P^(1/6)) less interprocessor communication on P processors than any previous parallel QR implementation. Our performance study on Intel Knights-Landing and Cray XE supercomputers demonstrates the effectiveness of this CholeskyQR2 parallelization on a large number of nodes. Specifically, relative to ScaLAPACK's QR, on 1024 nodes of Stampede2, our CholeskyQR2 implementation is faster by 2.6x-3.3x in strong scaling tests and by 1.1x-1.9x in weak scaling tests.
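
For readers unfamiliar with the CholeskyQR2 algorithm named in the abstract, the following is a minimal sequential sketch in NumPy of the standard textbook formulation: one Cholesky-QR pass (Cholesky factorization of the Gram matrix, followed by a solve) applied twice, with the two triangular factors multiplied together. It is not the parallel 3D-grid algorithm the paper introduces. The function names `cholesky_qr` and `cholesky_qr2` are illustrative assumptions, and the sketch assumes a tall, full-rank, reasonably well-conditioned input.

```python
import numpy as np


def cholesky_qr(a):
    """One Cholesky-QR pass: A = Q R with R from the Cholesky factor of A^T A."""
    # Gram matrix (n x n for an m x n input with m >= n).
    gram = a.T @ a
    # np.linalg.cholesky returns lower-triangular L with gram = L L^T,
    # so the upper-triangular factor is R = L^T.
    r = np.linalg.cholesky(gram).T
    # Q = A R^{-1}, computed by solving R^T Q^T = A^T.
    # A general solver is used here for simplicity (an assumption of this
    # sketch); a triangular solve would normally be preferred.
    q = np.linalg.solve(r.T, a.T).T
    return q, r


def cholesky_qr2(a):
    """Apply Cholesky-QR twice to improve the orthogonality of Q."""
    q1, r1 = cholesky_qr(a)
    q, r2 = cholesky_qr(q1)
    # A = Q1 R1 = (Q R2) R1, so the combined triangular factor is R2 R1.
    return q, r2 @ r1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((1000, 16))  # tall rectangular test matrix
    q, r = cholesky_qr2(a)
    print("orthogonality error:", np.linalg.norm(q.T @ q - np.eye(16)))
    print("residual:", np.linalg.norm(q @ r - a) / np.linalg.norm(a))
```

The second pass is what distinguishes CholeskyQR2 from plain Cholesky-QR: the first pass can lose orthogonality when the input is ill-conditioned, and repeating it on the nearly orthogonal Q1 recovers most of that loss.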


Bibliographic record

Source
IEEE International Parallel and Distributed Processing Symposium | 2019 | pp. 89-100 | 12 pages
Authors
Edward Hutter; Edgar Solomonik
Original format: PDF
Keywords
QR factorization; Communication avoiding algorithm; Mathematical software

