ACM SIGPLAN Conference on Programming Language Design and Implementation

SMAT: An Input Adaptive Auto-Tuner for Sparse Matrix-Vector Multiplication

Abstract

Sparse matrix-vector multiplication (SpMV) is an important kernel in both traditional high-performance computing and emerging data-intensive applications. So far, SpMV libraries have been optimized through either application-specific or architecture-specific approaches, which makes them too complicated to be used extensively in real applications. In this work we develop a sparse matrix-vector multiplication auto-tuning system (SMAT) to bridge the gap between specific optimizations and general-purpose usage. SMAT provides users with a unified programming interface in compressed sparse row (CSR) format and automatically determines the optimal format and implementation for any input sparse matrix at runtime. For this purpose, SMAT leverages a learning model, generated in an off-line stage by a machine learning method with a training set of more than 2000 matrices from the UF sparse matrix collection, to quickly predict the best combination of the matrix feature parameters. Our experiments show that SMAT achieves impressive performance of up to 51 GFLOPS in single precision and 37 GFLOPS in double precision on mainstream x86 multi-core processors, both more than 3 times faster than the Intel MKL library. We also demonstrate its adaptability in an algebraic multigrid solver from the Hypre library, with a reported performance improvement of more than 20%.
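For readers unfamiliar with the CSR layout named in the abstract, the following is a minimal reference sketch of SpMV over CSR arrays, written in plain Python. It is not SMAT's interface or implementation: the function names `csr_spmv` and `row_length_features` are hypothetical, and the row-length statistics are only an assumed example of the kind of matrix features an input-adaptive tuner could inspect, not the feature set used in the paper.

```python
from typing import Dict, List, Sequence


def csr_spmv(row_ptr: Sequence[int], col_idx: Sequence[int],
             values: Sequence[float], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i inside
    col_idx (column indices) and values (nonzero entries).
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def row_length_features(row_ptr: Sequence[int]) -> Dict[str, float]:
    """Hypothetical structural features derived from the CSR row pointer."""
    n_rows = len(row_ptr) - 1
    lengths = [row_ptr[i + 1] - row_ptr[i] for i in range(n_rows)]
    nnz = row_ptr[-1] if row_ptr else 0
    return {
        "nnz": nnz,
        "mean_row_nnz": nnz / n_rows if n_rows > 0 else 0.0,
        "max_row_nnz": max(lengths, default=0),
    }


# Example matrix:
#   [[10,  0,  0],
#    [ 0, 20, 30],
#    [40,  0, 50]]
row_ptr = [0, 1, 3, 5]
col_idx = [0, 1, 2, 0, 2]
values = [10.0, 20.0, 30.0, 40.0, 50.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
features = row_length_features(row_ptr)
print(y)         # [10.0, 130.0, 190.0]
print(features)
```

Statistics such as the maximum and mean number of nonzeros per row are cheap to compute from `row_ptr` alone, which is why they are a plausible input for a runtime format decision; which features and which storage formats SMAT actually uses are described in the paper itself.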
