首页> 外文期刊>Parallel Computing >On iterative QR pre-processing in the parallel block-Jacobi SVD algorithm
【24h】

On iterative QR pre-processing in the parallel block-Jacobi SVD algorithm

机译:并行块-Jacobi SVD算法中的迭代QR预处理

获取原文
获取原文并翻译 | 示例

摘要

An efficient version of the parallel two-sided block-Jacobi algorithm for the singular value decomposition of an m × n matrix A includes the pre-processing step, which consists of the QR factorization of A with column pivoting followed by the optional LQ factorization of the R-factor. Then the iterative two-sided block-Jacobi algorithm is applied in parallel to the R-factor (or L-factor). For the efficient computation of the parallel QR (or LQ) factorization with (or without) column pivoting implemented in the ScaLAPACK, some matrix block cyclic distribution on a process grid r xc with p = r × c, r. c ≥ 1, and block size n_b × n_b is required so that all processors remain busy during the whole parallel QR (or LQ) factorization. Optimal values for parameters r, c and n_b are estimated experimentally using matrices of order n = 4000 and 8000, and the number of processors p = 8 and 16, respectively. It turns out that the optimal values are about n_b = 100 and r ≤ c with both r, c near to p~(1/2). These parameters are then used in numerical experiments for six various distributions of singular values combined with well- (k = 10~1) and ill-conditioned matrices (k = 10~8). It is shown that using optimal parameters in the pre-processing step, the parallel two-sided block-Jacobi SVD algorithm performs better (or equally well) than the ScaLAPACK routine PDGESVD for matrices with a multiple minimal/maximal singular value regardless to the condition number. For other distributions of singular values, our algorithm is slower than the ScaLAPACK. The un-pivoted QRLQ pre-processing step is then re-formulated and extended to the QR iteration, and its connection to the QR algorithm applied to specific symmetric, positive definite matrices is shown. This connection helps to explain observations in another set of experiments with a variable number of QR iteration steps. In general, the best results for all six distributions of singular values are achieved by using about six QR iteration steps in pre-processing.
机译:用于m×n矩阵A的奇异值分解的并行两面并行块雅可比算法的有效版本包括预处理步骤,该步骤包括A的QR分解和列旋转,然后进行可选的LQ分解。 R因子。然后,将并行的双面块雅可比算法并行应用于R因子(或L因子)。为了在ScaLAPACK中实现带(或不带)列旋转的并行QR(或LQ)分解的高效计算,在过程网格r xc上具有p = r×c,r的一些矩阵块循环分布。 c≥1,并且需要块大小n_b×n_b,以便在整个并行QR(或LQ)分解期间,所有处理器都保持繁忙状态。参数r,c和n_b的最佳值分别使用n = 4000和8000阶的矩阵以及处理器数量p = 8和16的实验进行了估计。结果表明,最优值约为n_b = 100且r≤c,而r,c均接近p〜(1/2)。这些参数然后在数值实验中用于奇异值的六种不同分布,并结合良好(k = 10〜1)和病态矩阵(k = 10〜8)。结果表明,在预处理步骤中使用最佳参数,无论条件如何,并行的双面块雅各比SVD算法对于具有多个最小/最大奇异值的矩阵,其性能均优于(或等效)ScaLAPACK例程PDGESVD。数。对于奇异值的其他分布,我们的算法比ScaLAPACK慢。然后,重新计算未旋转的QRLQ预处理步骤,并将其扩展到QR迭代,并显示其与应用于特定对称,正定矩阵的QR算法的联系。这种联系有助于解释另一组具有可变数量QR迭代步骤的实验中的观察结果。通常,通过在预处理中使用大约六个QR迭代步骤,可以实现所有六个奇异值分布的最佳结果。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号