In LAPACK there are two types of subroutines for solving problems with symmetric matrices: routines for full storage and routines for packed storage. The full format performs much better because it allows the use of Level 2 and Level 3 BLAS, whereas the packed format needs only about 50% of the memory of the full format. We propose a new storage layout that combines the advantages of both formats: its factorization performance is better than that of the full storage layout, and its memory requirement is only slightly larger, percentage-wise, than that of packed storage. Our new algorithms, called DBSSV, DBSTRF, and DBSTRS, are now part of ESSL [9]. On three recent IBM RS/6000 platforms (Power3, Power2, and PowerPC 604e), DBSTRF outperforms LAPACK's DSYTRF by about 20%, and DBSTRS, with 100 right-hand sides (RHS), outperforms LAPACK's DSYTRS by more than 100%. These comparisons are decidedly unfair to our new algorithms: we compare against LAPACK's Level 3 full-storage algorithms rather than against its Level 2 packed algorithms.
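To make the "about 50%" memory figure concrete, the following minimal sketch counts the elements held by full storage and by the standard LAPACK lower packed format, and shows the usual column-major packed index mapping. It illustrates only the two existing LAPACK formats, not the new layout proposed here; the function names are hypothetical and chosen for illustration.

```python
def full_storage_elements(n: int) -> int:
    """Elements stored for an n-by-n symmetric matrix in full format."""
    return n * n


def packed_storage_elements(n: int) -> int:
    """Elements stored in packed format: one triangle, including the diagonal."""
    return n * (n + 1) // 2


def packed_lower_index(i: int, j: int, n: int) -> int:
    """0-based position of A[i, j] (with i >= j) in lower packed storage.

    Columns of the lower triangle are stored one after another, which is
    the standard LAPACK packed convention for UPLO = 'L'.
    """
    if not (0 <= j <= i < n):
        raise ValueError("requires 0 <= j <= i < n")
    return i + j * (2 * n - j - 1) // 2


if __name__ == "__main__":
    n = 1000
    ratio = packed_storage_elements(n) / full_storage_elements(n)
    print(f"packed/full storage ratio for n={n}: {ratio:.4f}")
```

The ratio is (n + 1) / (2n), which approaches one half as n grows. The indirect indexing in `packed_lower_index` also hints at why packed routines are harder to express with Level 3 BLAS: the columns of the triangle have different lengths, so the data do not form a regular two-dimensional array.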