

Performance of panel and block approaches to sparse Cholesky factorization on the iPSC/860 and Paragon multicomputers



Abstract

Sparse Cholesky factorization has historically achieved extremely low performance on distributed memory multiprocessors. Three issues must be addressed to improve this situation: (1) parallel factorization methods must be based on more efficient sequential methods; (2) parallel machines must provide higher interprocessor communication bandwidth; and (3) the sparse matrices used to evaluate parallel sparse factorization performance should be more representative of the sizes of matrices people would factor on large parallel machines. All of these issues have in fact already been addressed. Specifically: (1) single-node performance can be improved by moving from a column-oriented approach, where the computational kernel is Level 1 BLAS, to either a panel- or block-oriented approach, where the kernel is Level 3 BLAS; (2) communication hardware has improved dramatically, with new parallel computers providing higher communication bandwidth than previous parallel computers; and (3) several larger benchmark matrices are now available, and newer parallel machines offer sufficient memory per node to factor these larger matrices. The result of addressing these three issues is extremely high performance on moderately parallel machines. This paper demonstrates performance levels of 650 double-precision MFLOPS on 32 processors of the Intel Paragon system, 1 GFLOPS on 64 processors, and 1.7 GFLOPS on 128 processors. This paper also does a direct performance comparison between the iPSC/860 and Paragon systems, as well as a comparison between panel- and block-oriented approaches to parallel factorization.
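The contrast in point (1) between column-oriented and block-oriented kernels can be sketched as follows. This is only an illustration under strong assumptions: it is dense, sequential, pure Python, and not the sparse distributed code evaluated in the paper. The function names `cholesky_column` and `cholesky_blocked` and the block-size parameter `bs` are hypothetical. In a tuned implementation, the three block steps would typically map to matrix-matrix routines such as LAPACK `DPOTRF` and Level 3 BLAS `DTRSM` and `DSYRK`, which is where the single-node speedup would come from.

```python
import math


def cholesky_column(a):
    """Column-oriented dense Cholesky, returning lower-triangular L with A = L * L^T.

    Each column is updated by earlier columns one vector at a time,
    the access pattern of Level 1 BLAS (axpy-style) kernels.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Start from the lower part of column j of A.
        col = [a[i][j] for i in range(j, n)]
        # Subtract the contribution of every earlier column k < j.
        for k in range(j):
            ljk = l[j][k]
            for i in range(j, n):
                col[i - j] -= l[i][k] * ljk
        diag = math.sqrt(col[0])
        for i in range(j, n):
            l[i][j] = col[i - j] / diag
    return l


def cholesky_blocked(a, bs):
    """Block-oriented dense Cholesky with block size bs (hypothetical parameter).

    The work is grouped into block operations; written here as plain loops,
    but each step corresponds to a matrix-matrix (Level 3 style) kernel.
    """
    n = len(a)
    w = [row[:] for row in a]  # working copy; only the lower triangle is used
    for s in range(0, n, bs):
        e = min(s + bs, n)
        # Step 1: factor the diagonal block w[s:e, s:e] in place.
        for j in range(s, e):
            w[j][j] = math.sqrt(w[j][j] - sum(w[j][k] ** 2 for k in range(s, j)))
            for i in range(j + 1, e):
                dot = sum(w[i][k] * w[j][k] for k in range(s, j))
                w[i][j] = (w[i][j] - dot) / w[j][j]
        # Step 2: triangular solve for the panel below the diagonal block.
        for i in range(e, n):
            for j in range(s, e):
                dot = sum(w[i][k] * w[j][k] for k in range(s, j))
                w[i][j] = (w[i][j] - dot) / w[j][j]
        # Step 3: symmetric update of the trailing submatrix (lower triangle).
        for i in range(e, n):
            for j in range(e, i + 1):
                w[i][j] -= sum(w[i][k] * w[j][k] for k in range(s, e))
    # Return L with the strict upper triangle zeroed.
    return [[w[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
```

For the symmetric positive definite matrix with rows (4, 12, -16), (12, 37, -43), (-16, -43, 98), both functions return the factor with rows (2, 0, 0), (6, 1, 0), (-8, 5, 3). The two versions perform the same arithmetic; the blocked one only reorders it so that most of the work falls into the trailing update, which reuses each loaded block many times.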


Bibliographic record

Source: Scalable High Performance Computing Conference, 1994, 10 pages
Authors: Rothberg, E.; Institute of Electrical and Electronics Engineers
Original format: PDF
Chinese Library Classification: Computing technology, computer technology

Similar literature

1. Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization [J]. Heejo Lee, Jong Kim, Sung Je Hong. Parallel Computing, 2003, no. 1.
2. Minimal-storage high-performance Cholesky factorization via blocking and recursion [J]. IBM Journal of Research and Development, 2000, no. 6.
3. Level-3 Cholesky factorization routines improve performance of many Cholesky algorithms [J]. Charles R. Crawford. Computing Reviews, 2013, no. 8.
4. Performance of panel and block approaches to sparse Cholesky factorization on the iPSC/860 and Paragon multicomputers [C]. Rothberg, E. 1994.
5. Communication-efficient parallel sparse Cholesky factorization [D]. Eswar, Kalluri. 1995.
6. Sparse domain approaches in dynamic SPECT imaging with high-performance computing [O]. Hui Pan, Haoran Chang, Debasis Mitra. 2017.
7. Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization [O]. Heejo Lee, Jong Kim, Sung Je Hong. 2003.
8. A performance study of sparse Cholesky factorization on INTEL iPSC/860 [R]. Zubair, M., Ghose, M. 1992.
