International Symposium on Signal Processing and Information Technology

Multilevel Approaches to Fine Tune Performance of Linear Algebra Libraries


Abstract

We propose a multilevel methodology to improve the performance of parallel codes whose run time increases at a faster rate than the increase in workload. We have derived the conditions under which the proposed methodology improves performance for a simple parallel computing model. Formulas to predict the amount of performance improvement that is attainable are also derived for this simple computing model. The effectiveness of the proposed strategy is demonstrated by applying it to the highly optimized BLAS (Basic Linear Algebra Subprograms) routines cblas_dgemm, cblas_dtrmm and cblas_dsymm from the Intel MKL (Math Kernel Library) on the Intel KNL (Knights Landing) platform. We are able to reduce the run time of MKL cblas_dgemm by 20%, cblas_dtrmm by 15%, and cblas_dsymm by 50% on double-precision matrices of size 16Kx16K. Further, our performance prediction formulas are demonstrated to be accurate on this platform.
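The routines named above are standard CBLAS entry points exposed by Intel MKL. As a concrete illustration of the interface being tuned, here is a minimal C sketch of a cblas_dgemm call; it is not the authors' benchmark or their multilevel tuning layer, and the matrix size, initialization values, and link flag are illustrative assumptions.

```c
/* Minimal sketch (not the paper's code) of the double-precision GEMM
 * C = alpha*A*B + beta*C via the Intel MKL CBLAS interface. */
#include <stdio.h>
#include <mkl.h>   /* declares cblas_dgemm, mkl_malloc; link e.g. with -lmkl_rt */

int main(void) {
    const MKL_INT n = 1024;  /* illustrative; the paper benchmarks 16K x 16K */
    double *A = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
    double *B = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
    double *C = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
    if (!A || !B || !C) { fprintf(stderr, "allocation failed\n"); return 1; }

    for (MKL_INT i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; C[i] = 0.0; }

    /* C = 1.0 * A * B + 0.0 * C, row-major storage, no transposes */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A, n, B, n, 0.0, C, n);

    printf("C[0] = %f\n", C[0]);  /* expect n * 1.0 * 2.0 */
    mkl_free(A); mkl_free(B); mkl_free(C);
    return 0;
}
```

Timing such a call at the paper's 16K × 16K problem size would give the kind of baseline measurement that the proposed multilevel methodology is reported to improve by 20% for cblas_dgemm.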
