Published in: IEEE International Parallel and Distributed Processing Symposium

Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations



Abstract

Obtaining highly accurate predictions of the properties of light atomic nuclei using the configuration interaction (CI) approach requires computing a few extremal eigenpairs of the many-body nuclear Hamiltonian matrix. In the Many-body Fermion Dynamics for nuclei (MFDn) code, a block eigensolver is used for this purpose. Because the sparse matrices involved are large, a significant fraction of the time spent on the eigenvalue computations goes to multiplying a sparse matrix (and the transpose of that matrix) with multiple vectors (SpMM and SpMM_T). Existing implementations of SpMM and SpMM_T significantly underperform expectations. Thus, in this paper, we present and analyze optimized implementations of SpMM and SpMM_T. We base our implementation on the compressed sparse blocks (CSB) matrix format and target systems with multi-core architectures. We develop a performance model that allows us to understand and estimate the performance characteristics of our SpMM kernel implementations, and demonstrate the efficiency of our implementation on a series of real-world matrices extracted from MFDn. In particular, we obtain a 3-4x speedup on the requisite operations over good implementations based on the commonly used compressed sparse row (CSR) matrix format. The improvements in the SpMM kernel suggest we may attain roughly a 40% speedup in the overall execution time of the block eigensolver used in MFDn.
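For readers unfamiliar with the two kernels, the following is a minimal sketch of what SpMM and SpMM_T compute when the matrix is stored in the CSR format that the abstract uses as its baseline. It is not the paper's CSB implementation and makes no attempt at the optimizations described there. The function names `csr_spmm` and `csr_spmm_t`, the list-of-rows layout for the dense vector block, and the tiny example matrix are all assumptions made for illustration.

```python
def csr_spmm(n_rows, row_ptr, col_idx, vals, X):
    """Y = A @ X for A in CSR form; X is a list of rows, each with m entries."""
    m = len(X[0])
    Y = [[0.0] * m for _ in range(n_rows)]
    for i in range(n_rows):
        y_i = Y[i]
        # Nonzeros of row i live in positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a, x_j = vals[k], X[col_idx[k]]
            for c in range(m):
                y_i[c] += a * x_j[c]
    return Y


def csr_spmm_t(n_cols, row_ptr, col_idx, vals, X):
    """Y = A^T @ X using the same CSR arrays, without forming A^T."""
    m = len(X[0])
    Y = [[0.0] * m for _ in range(n_cols)]
    for i in range(len(row_ptr) - 1):
        x_i = X[i]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Each nonzero A[i][j] scatters into output row j.
            a, y_j = vals[k], Y[col_idx[k]]
            for c in range(m):
                y_j[c] += a * x_i[c]
    return Y


# Hypothetical 2 x 3 example matrix:
# A = [[1, 0, 2],
#      [0, 3, 0]]
row_ptr = [0, 2, 3]
col_idx = [0, 2, 1]
vals = [1.0, 2.0, 3.0]

X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 rows, 2 vectors
Y = csr_spmm(2, row_ptr, col_idx, vals, X)  # A @ X
Z = csr_spmm_t(3, row_ptr, col_idx, vals, [[1.0, 1.0], [2.0, 0.0]])  # A^T @ W
```

In this sketch the forward product writes each output row from a single outer-loop iteration, while the transpose product scatters updates across output rows. A row-parallel version of the second loop would therefore need some form of synchronization or private accumulation.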
