Journal of computational science

Sparse supernodal solver using block low-rank compression: Design, performance and analysis

Abstract

This paper presents two approaches that use a Block Low-Rank (BLR) compression technique to reduce the memory footprint and/or the time-to-solution of the sparse supernodal solver PaStiX. This flat, non-hierarchical compression method makes it possible to exploit the low-rank property of the blocks that appear during the factorization of sparse linear systems arising from the discretization of partial differential equations. The proposed solver can be used either as a direct solver at a lower precision or as a very robust preconditioner. The first approach, called Minimal Memory, illustrates the maximum memory gain that can be obtained with the BLR compression method, while the second approach, called Just-In-Time, mainly focuses on reducing the computational complexity and thus the time-to-solution. Singular Value Decomposition (SVD) and Rank-Revealing QR (RRQR) are compared as compression kernels in terms of factorization time, memory consumption, and numerical properties. Experiments on a shared-memory node with 24 threads and 128 GB of memory are performed to evaluate the potential of both strategies. On a set of matrices from real-life problems, we demonstrate a memory footprint reduction of up to 4 times using the Minimal Memory strategy and a computational time speedup of up to 3.5 times with the Just-In-Time strategy. Then, we study the impact of the configuration parameters of the BLR solver, which allowed us to solve a 3D Laplacian of 36 million unknowns on a single node, while the full-rank solver stopped at 8 million due to memory limitations.
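
To make the idea of compressing a single dense block concrete, the following is a minimal sketch, not PaStiX code. It assumes a tolerance-based stopping rule on the residual Frobenius norm and uses column-pivoted Gram-Schmidt as a simplified stand-in for an RRQR kernel. The function names (`compress_block`, `relative_error`) and the synthetic test block are hypothetical and chosen only for illustration.

```python
import math


def compress_block(block, tol):
    """Approximate a dense block by Q @ R using column-pivoted Gram-Schmidt.

    Stops once the residual Frobenius norm is at most tol * ||block||_F.
    Returns (q_cols, r_rows): rank-many columns of Q and rows of R.
    Illustrative stand-in for an RRQR kernel, not the PaStiX implementation.
    """
    m, n = len(block), len(block[0])
    # Column-major working copy that is deflated in place.
    resid = [[block[i][j] for i in range(m)] for j in range(n)]
    norm_a = math.sqrt(sum(x * x for col in resid for x in col))
    q_cols, r_rows = [], []
    while len(q_cols) < min(m, n):
        col_norms2 = [sum(x * x for x in col) for col in resid]
        if math.sqrt(sum(col_norms2)) <= tol * norm_a:
            break  # residual is small enough: stop adding rank
        p = max(range(n), key=col_norms2.__getitem__)  # pivot column
        piv_norm = math.sqrt(col_norms2[p])
        q = [x / piv_norm for x in resid[p]]
        # Coefficient of every residual column along the new direction q.
        r = [sum(qi * x for qi, x in zip(q, col)) for col in resid]
        for j, col in enumerate(resid):
            for i in range(m):
                col[i] -= r[j] * q[i]
        q_cols.append(q)
        r_rows.append(r)
    return q_cols, r_rows


def relative_error(block, q_cols, r_rows):
    """Relative Frobenius error between block and its Q @ R approximation."""
    num = den = 0.0
    for i, row in enumerate(block):
        for j, a in enumerate(row):
            approx = sum(q[i] * r[j] for q, r in zip(q_cols, r_rows))
            num += (a - approx) ** 2
            den += a * a
    return math.sqrt(num / den) if den else 0.0


# Synthetic example (an assumption, not data from the paper): interaction
# between two well-separated point sets through the kernel 1 / (y - x),
# which yields a numerically low-rank block.
n = 60
demo_block = [[1.0 / ((5.0 + j / n) - i / n) for j in range(n)] for i in range(n)]
q_cols, r_rows = compress_block(demo_block, 1e-8)
rank = len(q_cols)
print("rank:", rank)
print("stored entries:", rank * 2 * n, "instead of", n * n)
print("relative error:", relative_error(demo_block, q_cols, r_rows))
```

In this sketch the tolerance plays the role of the precision trade-off mentioned in the abstract: a looser tolerance gives a smaller rank and less storage, at the cost of a less accurate block.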