Journal: IEEE Transactions on Parallel and Distributed Systems
A Hybrid Parallel Solving Algorithm on GPU for Quasi-Tridiagonal System of Linear Equations

Abstract

Quasi-tridiagonal systems of linear equations arise in numerical simulations, and as problem sizes grow, some solvers struggle with systems of millions of unknowns or more. We present a solving method that combines direct and iterative methods and requires less storage during computation. The method splits a quasi-tridiagonal matrix into a tridiagonal matrix and a sparse matrix, so that the tridiagonal system can be solved by a direct method within each iteration. Because the approximate solutions obtained by the direct method are closer to the exact solution, the iteration on the quasi-tridiagonal system converges faster. Furthermore, we present an improved cyclic reduction algorithm that uses a partition strategy to solve tridiagonal systems on the GPU; intermediate data are kept in shared memory, which significantly reduces memory access latency. In experiments on 10 test cases, our method needs significantly fewer iterations on average than Jacobi, Gauss-Seidel (GS), GMRES, and BiCG, and its iteration count is close to those of BiCGSTAB, BiCRSTAB, and TFQMR. In parallel mode, the partition strategy raises the parallel efficiency of our method, and the method outperforms commonly used iterative and direct methods because each iteration requires less computation.
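The splitting idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact update rule, so this assumes the simplest stationary form: write A = T + S with T tridiagonal and S the remaining sparse entries, then repeat T x_new = b - S x_old, solving each tridiagonal system directly. The function names (`thomas_solve`, `hybrid_solve`), the dictionary layout for S, and the test matrix are hypothetical, and the serial Thomas algorithm stands in for the paper's GPU cyclic reduction, which is not reproduced here.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Direct solve of a tridiagonal system (serial Thomas algorithm).

    lower[i] is the entry at (i, i-1) and upper[i] the entry at (i, i+1);
    lower[0] and upper[-1] are ignored. Assumes no pivoting is needed,
    e.g. a diagonally dominant matrix.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def hybrid_solve(lower, diag, upper, extra, rhs, tol=1e-10, max_iter=200):
    """Assumed splitting iteration: T x_new = rhs - S x_old.

    extra maps (row, col) -> value for the entries of S, i.e. the
    nonzeros of A that lie outside the three central diagonals.
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(diag)
    for k in range(1, max_iter + 1):
        r = list(rhs)
        for (i, j), v in extra.items():  # r = rhs - S x
            r[i] -= v * x[j]
        x_new = thomas_solve(lower, diag, upper, r)
        change = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < tol:
            return x, k
    return x, max_iter


# Hypothetical test case: a periodic (cyclic) tridiagonal matrix, whose two
# corner entries make it quasi-tridiagonal. The exact solution is all ones.
n = 5
lower = [-1.0] * n
diag = [4.0] * n
upper = [-1.0] * n
extra = {(0, n - 1): -1.0, (n - 1, 0): -1.0}
rhs = [2.0] * n  # each row of A sums to 4 - 1 - 1 = 2

x, iters = hybrid_solve(lower, diag, upper, extra, rhs)
print(x, iters)
```

In this sketch only the sparse remainder S is applied explicitly in each sweep, and the tridiagonal part is inverted exactly, which is the property the abstract credits for the reduced iteration count relative to Jacobi or Gauss-Seidel. Convergence of this simple form depends on T dominating S; the example matrix is chosen so that it does.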