Journal: ACM Transactions on Mathematical Software

Design and Implementation of Adaptive SpMV Library for Multicore and Many-Core Architecture

Abstract

Sparse matrix-vector multiplication (SpMV) is an important computational kernel in traditional high-performance computing and emerging data-intensive applications. Previous SpMV libraries are optimized by either application-specific or architecture-specific approaches, which makes them difficult to use in real applications. In this work, we develop an auto-tuning system (SMATER) to bridge the gap between specific optimizations and general-purpose use. SMATER provides programmers with a unified interface based on the compressed sparse row (CSR) sparse matrix format and implicitly chooses the best format and fastest implementation for any input sparse matrix at runtime. SMATER leverages a machine-learning model and a retargetable back-end library to quickly predict the optimal combination. Performance parameters are extracted from 2,386 matrices in the SuiteSparse matrix collection. The experiments show that SMATER achieves good performance (up to 10 times that of the Intel Math Kernel Library (MKL) on an Intel E5-2680 v3) while being portable across state-of-the-art x86 multicore processors, NVIDIA GPUs, and Intel Xeon Phi accelerators. Compared with Intel MKL, SMATER runs more than 2.5 times faster on average. We further demonstrate its adaptivity in an algebraic multigrid solver from the Hypre library and report a performance improvement of more than 20%.
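
For readers unfamiliar with the CSR format on which the unified interface is based, a baseline SpMV kernel over CSR arrays might look like the sketch below. It is an illustration only, not SMATER's implementation or API: the function name `csr_spmv` and the array names `row_ptr`, `col_idx`, and `values` are assumptions chosen for this example, and the abstract does not describe the actual kernels or the format-selection model.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i] .. row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    (Hypothetical helper for illustration; not part of SMATER.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example 3x3 matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)
```

An auto-tuner of the kind the abstract describes would accept input in this CSR layout and dispatch internally to whichever format and kernel it predicts to be fastest, so calling code would not need to change.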
