International Symposium on Signal Processing and Information Technology

Multilevel Approaches to Fine Tune Performance of Linear Algebra Libraries


Abstract

We propose a multilevel methodology to improve the performance of parallel codes whose run time increases at a faster rate than the increase in workload. We have derived the conditions under which the proposed methodology improves performance for a simple parallel computing model. Formulas to predict the amount of performance improvement that is attainable are also derived for this simple computing model. The effectiveness of the proposed strategy is demonstrated by applying it to the highly optimized BLAS (Basic Linear Algebra Subprograms) routines cblas_dgemm, cblas_dtrmm and cblas_dsymm from the Intel MKL (Math Kernel Library) on the Intel KNL (Knights Landing) platform. We are able to reduce the run time of MKL cblas_dgemm by 20%, cblas_dtrmm by 15%, and cblas_dsymm by 50% on double-precision matrices of size 16Kx16K. Further, our performance prediction formulas are demonstrated to be accurate on this platform.
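The routines named above are standard CBLAS entry points exposed by Intel MKL. As a concrete illustration of the interface being tuned, here is a minimal C sketch of a cblas_dgemm call; it is not the authors' benchmark or their multilevel tuning layer, and the matrix size, initialization values, and link flag are illustrative assumptions.

```c
/* Minimal sketch (not the paper's code) of the double-precision GEMM
 * C = alpha*A*B + beta*C via the Intel MKL CBLAS interface. */
#include <stdio.h>
#include <mkl.h>   /* declares cblas_dgemm, mkl_malloc; link e.g. with -lmkl_rt */

int main(void) {
    const MKL_INT n = 1024;  /* illustrative; the paper benchmarks 16K x 16K */
    double *A = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
    double *B = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
    double *C = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
    if (!A || !B || !C) { fprintf(stderr, "allocation failed\n"); return 1; }

    for (MKL_INT i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; C[i] = 0.0; }

    /* C = 1.0 * A * B + 0.0 * C, row-major storage, no transposes */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A, n, B, n, 0.0, C, n);

    printf("C[0] = %f\n", C[0]);  /* expect n * 1.0 * 2.0 */
    mkl_free(A); mkl_free(B); mkl_free(C);
    return 0;
}
```

Timing such a call at the paper's 16K × 16K problem size would give the kind of baseline measurement that the proposed multilevel methodology is reported to improve by 20% for cblas_dgemm.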
