Journal: ACM Transactions on Mathematical Software

Rectangular Full Packed Format for Cholesky's Algorithm: Factorization, Solution, and Inversion



Abstract

We describe a new data format for storing triangular, symmetric, and Hermitian matrices called Rectangular Full Packed Format (RFPF). The standard two-dimensional arrays of Fortran and C (also known as full format) that are used to represent triangular and symmetric matrices waste nearly half of the storage space but provide high performance via the use of Level 3 BLAS. Standard packed format arrays fully utilize storage (array space) but provide low performance, as there is no Level 3 packed BLAS. RFPF combines the good features of packed and full storage: because it is a standard full-format representation, it obtains high performance through Level 3 BLAS, and it requires exactly the same minimal storage as the packed format. Each LAPACK full and/or packed triangular, symmetric, and Hermitian routine becomes a single new RFPF routine based on eight possible data layouts of RFPF. This new RFPF routine usually consists of two calls to the corresponding LAPACK full-format routine and two calls to Level 3 BLAS routines. This means no new software is required. As examples, we present LAPACK routines for Cholesky factorization, Cholesky solution, and Cholesky inverse computation in RFPF to illustrate this new work and to describe its performance on several commonly used computer platforms. For both serial and SMP parallel processing, LAPACK full routines using RFPF perform about the same as LAPACK full routines using the standard format, while using half the storage. Compared with vendor and/or reference packed routines, vendor LAPACK full routines with RFPF are roughly 1 to 43 times faster in serial execution and roughly 1 to 97 times faster in SMP parallel execution.
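The storage claims in the abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the rectangle shapes assume the non-transposed RFPF layouts as documented for LAPACK's RFP routines ((n+1) by n/2 for even n, n by (n+1)/2 for odd n). It counts array elements for an order-n triangular or symmetric matrix in each format.

```python
def full_storage(n: int) -> int:
    """Elements in a full n-by-n two-dimensional array."""
    return n * n


def packed_storage(n: int) -> int:
    """Elements in standard packed format: one triangle, stored column by column."""
    return n * (n + 1) // 2


def rfpf_shape(n: int) -> tuple[int, int]:
    """Shape of the RFPF rectangle (assumed non-transposed layout).

    Even n: (n + 1) rows by n / 2 columns.
    Odd n:  n rows by (n + 1) / 2 columns.
    """
    if n % 2 == 0:
        return (n + 1, n // 2)
    return (n, (n + 1) // 2)


def rfpf_storage(n: int) -> int:
    """Elements in the RFPF rectangle."""
    rows, cols = rfpf_shape(n)
    return rows * cols


if __name__ == "__main__":
    for n in (6, 7, 1000):
        print(n, full_storage(n), packed_storage(n), rfpf_shape(n), rfpf_storage(n))
```

In both the even and odd cases the rectangle holds exactly n(n+1)/2 elements, the same as packed format, while the full n-by-n array holds n^2, which is where the "nearly half" wasted storage comes from. Because the rectangle is an ordinary two-dimensional array, its sub-blocks can be passed to full-format routines.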
