首页> 外文期刊>IEEE Transactions on Parallel and Distributed Systems >Highly scalable parallel algorithms for sparse matrix factorization
【24h】

Highly scalable parallel algorithms for sparse matrix factorization

机译:高度可扩展的并行算法,用于稀疏矩阵分解

获取原文
获取原文并翻译 | 示例

摘要

In this paper, we describe scalable parallel algorithms for symmetric sparse matrix factorization, analyze their performance and scalability, and present experimental results for up to 1,024 processors on a Gray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algorithms substantially improve the state of the art in parallel direct solution of sparse linear systems-both in terms of scalability and overall performance. It is a well known fact that dense matrix factorization scales well and can be implemented efficiently on parallel computers. In this paper, we present the first algorithms to factor a wide class of sparse matrices (including those arising from two- and three-dimensional finite element problems) that are asymptotically as scalable as dense matrix factorization algorithms on a variety of parallel architectures. Our algorithms incur less communication overhead and are more scalable than any previously known parallel formulation of sparse matrix factorization. Although, in this paper, we discuss Cholesky factorization of symmetric positive definite matrices, the algorithms can be adapted for solving sparse linear least squares problems and for Gaussian elimination of diagonally dominant matrices that are almost symmetric in structure. An implementation of one of our sparse Cholesky factorization algorithms delivers up to 20 GFlops on a Gray T3D for medium-size structural engineering and linear programming problems. To the best of our knowledge, this is the highest performance ever obtained for sparse Cholesky factorization on any supercomputer.
机译:在本文中,我们描述了用于对称稀疏矩阵分解的可伸缩并行算法,分析了它们的性能和可伸缩性,并在Gray T3D并行计算机上提供了多达1,024个处理器的实验结果。通过我们的分析和实验结果,我们证明了在稀疏线性系统的并行直接解决方案中,我们的算法在可伸缩性和总体性能方面都大大改善了现有技术。众所周知的事实是,密矩阵分解很好地缩放,并且可以在并行计算机上有效地实现。在本文中,我们提出了第一个算法来分解各种稀疏矩阵(包括那些由二维和三维有限元问题引起的稀疏矩阵),这些稀疏矩阵在各种并行体系结构上都可以像密矩阵分解算法那样渐近地扩展。与稀疏矩阵分解的任何先前已知并行公式相比,我们的算法产生的通信开销更少,并且可伸缩性更高。尽管在本文中,我们讨论了对称正定矩阵的Cholesky分解,但该算法可适用于解决稀疏线性最小二乘问题和结构几乎对称的对角占优矩阵的高斯消除。我们的一种稀疏的Cholesky因式分解算法的实现可在Gray T3D上提供多达20个GFlop,用于中等大小的结构工程和线性编程问题。据我们所知,这是任何超级计算机上稀疏的Cholesky因式分解所获得的最高性能。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号