Sparse supernodal solver using block low-rank compression: Design, performance and analysis

Grégoire Pichon; Eric Darve; Mathieu Faverge; Pierre Ramet; Jean Roman

Abstract

This paper presents two approaches using a Block Low-Rank (BLR) compression technique to reduce the memory footprint and/or the time-to-solution of the sparse supernodal solver PaStiX. This flat, non-hierarchical compression method takes advantage of the low-rank property of the blocks that appear during the factorization of sparse linear systems arising from the discretization of partial differential equations. The proposed solver can be used either as a direct solver at a lower precision or as a very robust preconditioner. The first approach, called Minimal Memory, illustrates the maximum memory gain that can be obtained with the BLR compression method, while the second approach, called Just-In-Time, mainly focuses on reducing the computational complexity and thus the time-to-solution. Singular Value Decomposition (SVD) and Rank-Revealing QR (RRQR) are compared as compression kernels in terms of factorization time, memory consumption, and numerical properties. Experiments on a shared-memory node with 24 threads and 128 GB of memory are performed to evaluate the potential of both strategies. On a set of matrices from real-life problems, we demonstrate a memory footprint reduction of up to 4 times using the Minimal Memory strategy and a computational time speedup of up to 3.5 times with the Just-In-Time strategy. Then, we study the impact of the configuration parameters of the BLR solver, which allowed us to solve a 3D Laplacian of 36 million unknowns on a single node, while the full-rank solver stopped at 8 million due to memory limitation.
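
The low-rank property that the abstract relies on can be illustrated with a minimal sketch, assuming NumPy is available. It compresses one dense block with a truncated SVD, one of the two compression kernels named above (RRQR is not shown). The function name `compress_svd`, the test kernel 1/|x - y| on two well-separated intervals, and the tolerance 1e-8 are illustrative assumptions and are not taken from PaStiX or from the paper.

```python
import numpy as np

def compress_svd(block, tol):
    """Return (u, v) with block ~= u @ v.T, using the smallest rank whose
    relative Frobenius-norm error is at most tol. Hypothetical helper."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    # tail[k] is the Frobenius norm of the error when only the first k
    # singular values are kept; tail[0] is the norm of the block itself.
    tail_sq = np.cumsum((s ** 2)[::-1])[::-1]
    tail = np.sqrt(np.append(tail_sq, 0.0))
    rank = int(np.argmax(tail <= tol * tail[0]))  # first k meeting the tolerance
    return u[:, :rank] * s[:rank], vt[:rank, :].T

# Assumed test block: interaction between two well-separated point sets,
# a setting in which off-diagonal blocks are typically numerically low-rank.
x = np.linspace(0.0, 1.0, 200)
y = np.linspace(3.0, 4.0, 150)
block = 1.0 / np.abs(x[:, None] - y[None, :])

u, v = compress_svd(block, 1e-8)
rank = u.shape[1]
rel_err = np.linalg.norm(block - u @ v.T) / np.linalg.norm(block)
dense_entries = block.size            # m * n values stored in full rank
lowrank_entries = u.size + v.size     # rank * (m + n) values stored in low rank
print(rank, rel_err, dense_entries, lowrank_entries)
```

Storing `u` and `v` costs rank × (m + n) values instead of m × n, which is the kind of per-block saving that a BLR format accumulates over the factors. How the solver selects blocks and when it compresses them (the Minimal Memory and Just-In-Time strategies) is not modeled here.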

Bibliographic record

Source: Journal of computational science, 2018, issue 7, pp. 255-270 (16 pages)
Authors: Grégoire Pichon; Eric Darve; Mathieu Faverge; Pierre Ramet; Jean Roman
Original format: PDF
Language: English

Similar literature

1. Block-Diagonal Constrained Low-Rank and Sparse Graph for Discriminant Analysis of Image Data [J]. Tan Guo, Xiaoheng Tan, Lei Zhang. Sensors. 2017, no. 7
2. Performance analysis of pulse compression using phase-coded signals for sparse-array synthetic impulse and aperture radar [J]. Chen Baixiao, Zhang Shouhong. Journal of Electronics (CHINA). 1998, no. 4
3. Performance Analysis of Pulse Compression Using Phase-Coded Signals for Sparse-Array Synthetic Impulse and Aperture Radar [J]. Journal of Electronics (English edition). 1998, no. 4
4. Sparse Supernodal Solver Using Block Low-Rank Compression [C]. Gregoire Pichon, Eric Darve, Mathieu Faverge. IEEE International Parallel and Distributed Processing Symposium Workshops. 2017
5. A Non-Blocking Design Paradigm for WDM Mesh Backbone Networks and its Performance Analysis [D]. Ma, Xiao. 2018
6. Block-Diagonal Constrained Low-Rank and Sparse Graph for Discriminant Analysis of Image Data [O]. Tan Guo, Xiaoheng Tan, Lei Zhang. 2017
7. Sparse Supernodal Solver Using Block Low-Rank Compression: design, performance and analysis [O]. Pichon, Grégoire, Darve, Eric, Faverge, Mathieu. 2017
8. Performance of a Supernodal General Sparse Solver on the CRAY Y-MP: 1.68 GFLOPS with Autotasking [R]. Simon, Horst D., Vu, Phuong, Yang, Chao. 1989
