A Hybrid Parallel Solving Algorithm on GPU for Quasi-Tridiagonal System of Linear Equations

Kenli Li; Wangdong Yang; Keqin Li


Abstract

Quasi-tridiagonal systems of linear equations arise in numerical simulations, and as problem sizes grow, some solvers struggle with quasi-tridiagonal systems that have millions of unknowns or more. We present a solving method that combines direct and iterative methods and requires less storage during computation. Our method splits a quasi-tridiagonal matrix into a tridiagonal matrix and a sparse matrix, so that the tridiagonal system can be solved by a direct method within each iteration. Because the approximate solutions obtained by the direct method are closer to the exact solution, the convergence of the solver for the quasi-tridiagonal system is improved. Furthermore, we present an improved cyclic reduction algorithm that uses a partition strategy to solve tridiagonal equations on the GPU, with the intermediate data stored in shared memory to significantly reduce memory access latency. In our experiments on 10 test cases, the average number of iterations of our method is significantly lower than that of Jacobi, Gauss-Seidel (GS), GMRES, and BiCG, and close to that of BiCGSTAB, BiCRSTAB, and TFQMR. In parallel mode, the partition strategy raises the parallel computing efficiency of our method, and its performance is better than that of commonly used iterative and direct methods because each iteration requires less computation.
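The splitting idea in the abstract can be illustrated with a small serial sketch. It assumes one plausible reading of the method: write the matrix as A = T + S, with T the tridiagonal part and S the remaining sparse entries, and repeat the direct solve T x_new = b - S x_old until the iterates stop changing. The names `thomas_solve`, `hybrid_solve`, and `extra` are hypothetical, and the Thomas algorithm stands in for the partitioned cyclic reduction that the paper runs on the GPU; this is not the authors' implementation.

```python
# Illustrative sketch only: a serial stand-in, not the authors' GPU code.

def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system directly with the Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused);
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def hybrid_solve(lower, diag, upper, extra, b, tol=1e-10, max_iter=200):
    """Iterate T x_new = b - S x_old for A = T + S.

    `extra` maps (row, col) -> value for the entries of S, i.e. the
    nonzeros of A outside the three central diagonals (assumed layout).
    Returns (x, iterations_used).
    """
    n = len(diag)
    x = [0.0] * n
    for k in range(1, max_iter + 1):
        # Move the off-tridiagonal part to the right-hand side.
        rhs = list(b)
        for (i, j), v in extra.items():
            rhs[i] -= v * x[j]
        # Direct solve of the tridiagonal part inside the iteration.
        x_new = thomas_solve(lower, diag, upper, rhs)
        change = max(abs(p - q) for p, q in zip(x_new, x))
        x = x_new
        if change < tol:
            return x, k
    return x, max_iter
```

In this sketch only the three diagonals and the few extra entries are stored, which matches the abstract's point about low storage. Convergence is not guaranteed for every matrix; it depends on S being small relative to T, for example in a strongly diagonally dominant system.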


Bibliographic details

Source
IEEE Transactions on Parallel and Distributed Systems, 2016, No. 10, pp. 2795-2808 (14 pages)
Authors
Kenli Li; Wangdong Yang; Keqin Li
Author affiliations

College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China;

College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China;

College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China;

Original format: PDF
Text language: English
Keywords
Sparse matrices; Mathematical model; Graphics processing units; Iterative methods; Instruction sets; Convergence; Partitioning algorithms;


Similar literature

1. Parallel algorithms for solving linear systems with block-tridiagonal matrices on multi-core CPU with GPU [J]. Elena N. Akimova, Dmitry V. Belousov. Journal of Computational Science, 2012, No. 6.
2. A Class of Communication-avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines [J]. Marc Baboulin, Simplice Donfack, Jack Dongarra. Procedia Computer Science, 2012, No. 1.
3. The Theoretical Cost of Sequential and Parallel Algorithms for Solving Linear Systems of Equations [J]. Applied Mathematics and Mechanics (English Edition), 1996, No. 12.
4. Communication-Avoiding Parallel Algorithms for Solving Triangular Systems of Linear Equations [C]. Tobias Wicky, Edgar Solomonik, Torsten Hoefler. IEEE International Parallel and Distributed Processing Symposium, 2017.
5. Solving large sparse systems of nonlinear equations and nonlinear least squares problems using tensor methods on sequential and parallel computers [D]. Ali Bouaricha. 1992.
6. Transforming Lindblad Equations into Systems of Real-Valued Linear Equations: Performance Optimization and Parallelization of an Algorithm [O]. Iosif Meyerov, Evgeny Kozinov, Alexey Liniov. 2020.
7. A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines [O]. Marc Baboulin, Simplice Donfack, Jack Dongarra. 2012.
8. Hybrid Chebyshev Krylov Subspace Algorithm for Solving Nonsymmetric Systems of Linear Equations [R]. H. C. Elman, Y. Saad, P. E. Saylor. 1984.
