IEEE International Parallel and Distributed Processing Symposium Workshops

Threaded Accurate Matrix-Matrix Multiplications with Sparse Matrix-Vector Multiplications


Abstract

Basic Linear Algebra Subprograms (BLAS) is a frequently used numerical library for linear algebra computations. However, it places little emphasis on computational accuracy, especially with respect to accuracy assurance of the results. Although some algorithms for ensuring the computational accuracy of BLAS operations have been studied, their performance on advanced computer architectures still needs to be evaluated. In this study, we parallelize high-precision matrix-matrix multiplication using thread-level parallelism, and we evaluate the result from the viewpoints of both execution speed and accuracy. We implement a method that converts dense matrices into sparse matrices by exploiting the nature of the target algorithm and then applies sparse matrix-vector multiplication. Results obtained on the FX100 supercomputer system at Nagoya University indicate that (1) the ELL-format implementation achieves a 1.43x speedup and (2) a maximum speedup of 38x compared with a conventional dense-matrix implementation using dgemm.
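The key data-layout idea in the abstract is storing a sparsified matrix in ELL (ELLPACK) format, where every row is padded to the same number of stored entries so the matrix-vector product has a regular, easily threaded inner loop. The sketch below is illustrative only and is not the paper's implementation: the function names `dense_to_ell` and `ell_spmv`, the NumPy-based storage, and the zero-threshold parameter `tol` are all assumptions made for this example.

```python
import numpy as np

def dense_to_ell(A, tol=0.0):
    """Convert a dense matrix to ELL format (illustrative helper, not the
    paper's code): each row is padded to the maximum per-row nonzero count.
    Returns (values, column indices), both of shape (rows, max_nnz_per_row)."""
    rows = A.shape[0]
    nz_cols = [np.nonzero(np.abs(A[i]) > tol)[0] for i in range(rows)]
    width = max((len(c) for c in nz_cols), default=0)
    ell_val = np.zeros((rows, width), dtype=A.dtype)
    ell_col = np.zeros((rows, width), dtype=np.int64)  # padding points at col 0
    for i, cols in enumerate(nz_cols):
        ell_val[i, :len(cols)] = A[i, cols]
        ell_col[i, :len(cols)] = cols
    return ell_val, ell_col

def ell_spmv(ell_val, ell_col, x):
    """y = A @ x from the ELL arrays. Padded slots hold value 0, so they
    contribute nothing; the fixed row width is what makes the loop regular
    and amenable to threading/vectorization."""
    return (ell_val * x[ell_col]).sum(axis=1)
```

Because padded entries store the value 0, gathering `x` at a dummy column index is harmless; the trade-off is wasted storage when row lengths vary widely, which is why ELL pays off mainly for matrices with fairly uniform rows.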
