IBM Journal of Research and Development

High-performance linear algebra algorithms using new generalized data structures for matrices


Abstract

We present a novel way to produce dense linear algebra factorization algorithms. The current state-of-the-art (SOA) dense linear algebra algorithms have a performance inefficiency, and thus they give suboptimal performance for most LAPACK factorizations. We show that using standard Fortran and C two-dimensional arrays is the main source of this inefficiency. For the other standard format (packed one-dimensional arrays for symmetric and/or triangular matrices), the situation is much worse. We show how to correct these performance inefficiencies by using new data structures (NDS) along with so-called kernel routines. The NDS generalize the current storage layouts for both standard formats. We use the concept of Equivalence and Elementary Matrices along with coordinate (linear) transformations to prove that our method works for an entire class of dense linear algebra algorithms. Also, we use the Algorithms and Architecture approach to explain why our new method gives higher efficiency. The simplest forms of the new factorization algorithms are a direct generalization of the commonly used LINPACK algorithms. On IBM platforms they can be generated from simple, textbook-type codes by the XLF Fortran compiler. On the IBM POWER3 processor, our implementation of Cholesky factorization achieves 92% of peak performance, whereas the conventional SOA full-format LAPACK DPOTRF achieves 77% of peak performance. All programming for our NDS can be accomplished in standard Fortran through the use of three- and four-dimensional arrays. Thus, no new compiler support is necessary. Finally, we describe block hybrid formats (BHF). BHF require no additional storage beyond conventional (full and packed) matrix storage. This means that new algorithms based on BHF can be used as a backward-compatible replacement for LAPACK or LINPACK algorithms.
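
To make the storage idea concrete, below is a minimal C sketch of a square-block layout in the spirit of the NDS described in the abstract: the matrix is viewed as a grid of NB x NB blocks, i.e. addressed like a four-dimensional array, so that each block is contiguous in memory and can be passed directly to a level-3 kernel routine. The block size NB and the names blocked_index and copy_to_blocked are illustrative assumptions, not identifiers from the paper, and the exact block ordering used by the authors may differ.

#include <stdio.h>
#include <stdlib.h>

/* Offset of element (i, j) in a conventional column-major array
 * with leading dimension lda. */
static size_t full_index(size_t i, size_t j, size_t lda) {
    return i + j * lda;
}

/* Offset of element (i, j) in a blocked layout: NB x NB blocks are
 * stored block-column by block-column, and each block is itself
 * column-major.  For simplicity, n is assumed to be a multiple of NB. */
static size_t blocked_index(size_t i, size_t j, size_t n, size_t NB) {
    size_t bi = i / NB, bj = j / NB;   /* block coordinates        */
    size_t ii = i % NB, jj = j % NB;   /* offsets inside the block */
    size_t nb = n / NB;                /* blocks per dimension     */
    return (bj * nb + bi) * NB * NB + jj * NB + ii;
}

/* Copy a column-major matrix A into the blocked layout B. */
static void copy_to_blocked(const double *A, double *B,
                            size_t n, size_t NB) {
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < n; ++i)
            B[blocked_index(i, j, n, NB)] = A[full_index(i, j, n)];
}

int main(void) {
    const size_t n = 8, NB = 4;
    double *A = malloc(n * n * sizeof *A);
    double *B = malloc(n * n * sizeof *B);
    for (size_t k = 0; k < n * n; ++k) A[k] = (double)k;

    copy_to_blocked(A, B, n, NB);

    /* The same element is reachable through either index mapping. */
    printf("A(5,6) = %g, blocked = %g\n",
           A[full_index(5, 6, n)], B[blocked_index(5, 6, n, NB)]);

    free(A);
    free(B);
    return 0;
}

In standard Fortran the same layout can be expressed directly with three- and four-dimensional arrays, as the abstract notes; handling the packed triangular or symmetric case without extra storage is what the BHF variants address.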

