Journal: ACM Transactions on Mathematical Software

Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms

Abstract

Four routines called DPOTF3i, i = a, b, c, d, are presented. The DPOTF3i routines are a novel type of level-3 BLAS for use by BPF (Blocked Packed Format) Cholesky factorization and by the LAPACK routine DPOTRF. The performance of the DPOTF3i routines is still increasing when the performance of LAPACK's Level-2 routine DPOTF2 starts decreasing. This is our main result, and it implies that, because a larger block size nb can be used, the performance of DGEMM, DSYRK, and DTRSM also increases. The four DPOTF3i routines use simple register blocking. Because different platforms have different numbers of registers, the four routines use different register blocking sizes.
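
To make the role of the block size nb concrete, here is a minimal sketch of a right-looking blocked Cholesky factorization in plain Python. It is not the authors' Fortran code and contains no register blocking. The function names `potf2_lower` and `blocked_cholesky` are hypothetical; `potf2_lower` only stands in for the diagonal-block kernel (the role played by DPOTF2 or a DPOTF3i routine), and the two loop nests that follow it mimic what DTRSM and DSYRK/DGEMM would do in a DPOTRF-style algorithm.

```python
import math


def potf2_lower(a, j0, jb):
    """Factor the jb x jb diagonal block starting at (j0, j0) in place.

    Hypothetical stand-in for the diagonal-block kernel. Only the lower
    triangle of the block is read and overwritten with its Cholesky factor.
    """
    for j in range(j0, j0 + jb):
        d = a[j][j] - sum(a[j][k] ** 2 for k in range(j0, j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        a[j][j] = math.sqrt(d)
        for i in range(j + 1, j0 + jb):
            s = sum(a[i][k] * a[j][k] for k in range(j0, j))
            a[i][j] = (a[i][j] - s) / a[j][j]


def blocked_cholesky(a, nb):
    """Return lower-triangular L with L * L^T = a, using block size nb."""
    n = len(a)
    a = [list(row) for row in a]  # work on a copy
    for j0 in range(0, n, nb):
        jb = min(nb, n - j0)
        j1 = j0 + jb
        # 1. Factor the diagonal block A11 = L11 * L11^T.
        potf2_lower(a, j0, jb)
        # 2. Panel solve (DTRSM-like): L21 = A21 * L11^{-T}.
        for i in range(j1, n):
            for j in range(j0, j1):
                s = sum(a[i][k] * a[j][k] for k in range(j0, j))
                a[i][j] = (a[i][j] - s) / a[j][j]
        # 3. Trailing update (DSYRK/DGEMM-like): A22 -= L21 * L21^T,
        #    restricted to the lower triangle.
        for i in range(j1, n):
            for j in range(j1, i + 1):
                a[i][j] -= sum(a[i][k] * a[j][k] for k in range(j0, j1))
    # Clear the strictly upper triangle so the result is exactly L.
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 0.0
    return a


A = [[4, 12, -16], [12, 37, -43], [-16, -43, 98]]
print(blocked_cholesky(A, 2))
# [[2.0, 0.0, 0.0], [6.0, 1.0, 0.0], [-8.0, 5.0, 3.0]]
```

In this sketch nb controls how much work lands in step 1 versus steps 2 and 3. That is the trade-off the abstract refers to: if the diagonal-block kernel stays fast at larger nb, the level-3 style steps can also operate on larger blocks.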
