Journal: Computers, IEEE Transactions on

Performance Optimization Using Partitioned SpMV on GPUs and Multicore CPUs

Abstract

This paper presents a sparse matrix partitioning strategy to improve the performance of SpMV on GPUs and multicore CPUs. Unlike existing methods, which suit only particular kinds of sparse matrices, this method adapts to a wide range of sparse matrix types. In addition, our partitioning method obtains dense blocks by analyzing the probability distribution of non-zero elements in a sparse matrix, resulting in a very low proportion of zero padding. We make the following significant contributions. (1) We present a partitioning strategy for sparse matrices based on probabilistic modeling of the non-zero elements in a row. (2) We prove that, for given partition ratios derived from the computing power of the heterogeneous processors, our method achieves the highest mean density compared with other strategies. (3) We develop a CPU-GPU hybrid parallel computing model for SpMV on GPUs and multicore CPUs in a heterogeneous computing platform. Our partitioning strategy yields a balanced load distribution, and the performance of SpMV improves significantly when a sparse matrix is partitioned into dense blocks using our method. The average performance improvement of our solution for SpMV is about 15.75 percent on multicore CPUs, compared with the other solutions. By considering the rows of a matrix in a unique order based on the probability mass function of the number of non-zeros in a row, our solution for SpMV achieves an average performance improvement of about 33.52 percent on the GPUs and multicore CPUs of a heterogeneous computing platform, compared with partitioning methods based on the original row order of a matrix.
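The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea it describes: compute the empirical probability mass function of non-zeros per row, reorder rows by their non-zero count, split them between two processors by a given ratio, and compare block density against a split in the original row order. All function names, the greedy split rule, and the ELL-style padding model (each block padded to its longest row) are assumptions made for illustration, not the authors' method.

```python
from collections import Counter

def row_nnz(row_ptr):
    """Number of stored non-zeros in each row of a CSR matrix."""
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]

def nnz_pmf(counts):
    """Empirical probability mass function of non-zeros per row."""
    n = len(counts)
    return {k: c / n for k, c in sorted(Counter(counts).items())}

def partition_rows(counts, gpu_share, sort_rows=True):
    """Greedy split (an assumption): the first block takes rows until it
    holds at least gpu_share of all non-zeros; the rest form the second."""
    order = list(range(len(counts)))
    if sort_rows:
        # Group rows of similar length together (stable sort, longest first).
        order.sort(key=lambda i: counts[i], reverse=True)
    target = gpu_share * sum(counts)
    acc, split = 0, 0
    for pos, i in enumerate(order):
        if acc >= target:
            break
        acc += counts[i]
        split = pos + 1
    return order[:split], order[split:]

def block_density(counts, rows):
    """Share of real non-zeros in a block padded to its longest row."""
    width = max((counts[i] for i in rows), default=0)
    if width == 0:
        return 1.0
    return sum(counts[i] for i in rows) / (width * len(rows))

def mean_density(counts, blocks):
    """Average density over the non-empty blocks."""
    used = [b for b in blocks if b]
    return sum(block_density(counts, b) for b in used) / len(used)

# Hypothetical 6-row matrix with 1, 4, 1, 4, 1, 1 non-zeros per row.
row_ptr = [0, 1, 5, 6, 10, 11, 12]
counts = row_nnz(row_ptr)
pmf = nnz_pmf(counts)
sorted_blocks = partition_rows(counts, gpu_share=0.5, sort_rows=True)
original_blocks = partition_rows(counts, gpu_share=0.5, sort_rows=False)
print(pmf)
print(mean_density(counts, sorted_blocks), mean_density(counts, original_blocks))
```

In this toy case, grouping rows by length removes all padding, while splitting in the original row order pads both blocks to width 4. Note that the greedy rule here lets the first block overshoot its target share; a real load balancer would need a finer rule.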