首页> 外文会议>ACM/IEEE conference on Supercomputing >Parallel threshold-based ILU factorization
【24h】

Parallel threshold-based ILU factorization

机译:基于并行阈值的ILU分解

获取原文

摘要

The sparse linear systems arising in finite element applications are commonly solved using iterative methods. In particular, as the size of these problems increases, the increased computational and memory requirements of these problems render in-core direct solution methods unusable, leaving iterative methods as the only viable alternative for solving these problems in core.The major computational kernels of an iterative method are (i) computation of preconditioner, (ii) multiplication of a sparse matrix with a vector, and (iii) application of the preconditioner. Threshold-based incomplete LU factorization have been found to be quite effective in preconditioning iterative system solvers [14]. However, because these factorizations allow the fill elements to be created dynamically, their parallel formulations had not been well understood, and they have been considered to be unsuitable for distributed-memory parallel computers [13]. Furthermore, solution of the resulting sparse triangular system (which is required for the application of the preconditioner) is generally more difficult to parallelize than the multiplication of a sparse matrix with a vector.In this paper we show that highly parallel graph partitioning algorithms in conjunction with parallel algorithms for computing maximal independent sets can be used to develop scalable parallel formulations of incomplete factorization algorithms. We present a highly parallel formulation of the ILUT factorization algorithm [14] for distributed memory parallel computers. This algorithm uses our parallel multilevel k-way graph partitioning algorithm [6,8] in conjunction with a parallel maximal independent subset algorithm to parallelize both the factorization as well as the solution of the resulting triangular factors. We also present a modified ILUT factorization algorithm (ILUT*) that requires less time and is more scalable than ILUT. Our experiments on Cray T3D show that our parallel ILUT* algorithm achieve a high degree of concurrency, and when used as a preconditioner, it is comparable in quality to the unmodified ILUT algorithm. Furthermore, our experiments using the GMRES iterative solver show that the amount of time spent in computing the factorization using the ILUT* algorithm is usually much less than the amount of time required to solve the systems.
机译:通常使用迭代方法解决有限元应用中出现的稀疏线性系统。特别是,随着这些问题规模的增大,这些问题的不断增长的计算和内存需求使得内核内直接解决方法无法使用,从而使迭代方法成为解决内核中这些问题的唯一可行选择。迭代方法是(i)预处理器的计算,(ii)稀疏矩阵与矢量的相乘,以及(iii)预处理器的应用。已经发现基于阈值的不完全LU分解在预处理迭代系统求解器中非常有效[14]。但是,由于这些分解可以动态创建填充元素,因此对其并行表示还没有很好的理解,因此认为它们不适用于分布式内存并行计算机[13]。此外,与将稀疏矩阵与向量相乘相比,所得稀疏三角系统(应用预处理器所需的)的解决方案通常更难并行化。带有用于计算最大独立集的并行算法的算法可用于开发不完整分解算法的可扩展并行公式。我们为分布式内存并行计算机提出了ILUT因数分解算法[14]的高度并行表示。该算法使用我们的并行多级 k 方向图分区算法[6,8]以及并行的最大独立子集算法来并行化分解和生成的三角因子的求解。我们还提出了一种经过改进的ILUT分解算法(ILUT *),该算法比ILUT所需的时间更少且可扩展性更高。我们在Cray T3D上进行的实验表明,我们的并行ILUT *算法实现了高度的并发性,并且在用作预处理器时,其质量与未经修改的ILUT算法相当。此外,我们使用GMRES迭代求解器进行的实验表明,使用ILUT *算法计算因式分解所花费的时间通常比解决系统所需的时间少得多。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号