Rectangular Full Packed Format for LAPACK Algorithms Timings on Several Computers


Abstract

We describe a new data format for storing triangular and symmetric matrices called RFP (Rectangular Full Packed). The standard two-dimensional arrays of Fortran and C (also known as full format) that are used to store triangular and symmetric matrices waste nearly half the storage space but provide high performance via the use of level 3 BLAS. Standard packed format arrays fully utilize storage (array space) but provide low performance, as there are no level 3 packed BLAS. RFP format combines the good features of packed and full storage: because RFP is a full format, it obtains high performance using level 3 (L3) BLAS, and it requires exactly the same minimal storage as packed format. Each full and/or packed symmetric/triangular routine becomes a single new RFP routine. We present LAPACK routines for Cholesky factorization, inverse and solution computation in RFP format to illustrate this new work and to describe its performance on the IBM, Itanium, NEC, and SUN platforms. Performance of RFP versus LAPACK full routines for both serial and SMP parallel processing is about the same, while RFP uses half the storage. Compared with LAPACK packed routines, RFP routines are roughly 1 to 33 times faster for serial processing and roughly 1 to 100 times faster for SMP parallel processing. Existing LAPACK routines and vendor LAPACK routines were used in the serial and the SMP parallel study, respectively. In both studies vendor L3 BLAS were used.
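The storage claims in the abstract can be checked with a little arithmetic. The Python sketch below is only an illustration, not code from the paper: the function names are hypothetical, the packed index formula is the usual column-major lower-triangle convention, and the RFP array shape (n+1 by n/2 for even n, n by (n+1)/2 for odd n) is an assumption taken from the convention documented for LAPACK's RFP routines rather than from the abstract itself.

```python
def full_storage(n):
    """Elements in a full n-by-n two-dimensional array."""
    return n * n


def packed_storage(n):
    """Elements in one triangle of an n-by-n matrix, diagonal included."""
    return n * (n + 1) // 2


def rfp_shape(n):
    """Assumed RFP array shape (rows, cols), non-transposed variant.

    Assumption: even n -> (n + 1) x (n / 2); odd n -> n x ((n + 1) / 2).
    """
    if n % 2 == 0:
        return (n + 1, n // 2)
    return (n, (n + 1) // 2)


def packed_lower_index(i, j, n):
    """0-based position of A[i, j] (i >= j) in column-major lower packed storage."""
    if not (0 <= j <= i < n):
        raise ValueError("need 0 <= j <= i < n")
    return i + j * (2 * n - j - 1) // 2


def wasted_fraction(n):
    """Share of a full array not needed to hold one triangle."""
    return (full_storage(n) - packed_storage(n)) / full_storage(n)


if __name__ == "__main__":
    for n in (4, 5, 1000):
        rows, cols = rfp_shape(n)
        print(n, full_storage(n), packed_storage(n), rows * cols, wasted_fraction(n))
```

Under these assumptions the RFP rectangle holds exactly as many elements as the packed vector for every n, while the full array leaves a fraction (n - 1) / (2n) of its entries unused, which approaches one half as n grows.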
