GPU Accelerated Parallel Cholesky Factorization

Abstract

One of the fundamental problems in scientific computing is solving systems of linear equations. In finite element problems, Cholesky factorization is often used to solve systems whose matrices are symmetric positive definite. In this paper, Cholesky factorization is massively parallelized, and three optimization methods (highly parallel factorization, a tiling strategy, and memory scheduling) are used to accelerate it. A novel algorithm is implemented in OpenCL. Tests on a GPU show that the algorithm's performance increases with the matrix dimension, reaching 785.41 GFLOPS, a speedup of about 50x. OpenCL on the GPU thus substantially improves the performance of Cholesky factorization.
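To make the tiling idea concrete, here is a minimal sequential sketch of a generic right-looking tiled Cholesky factorization in plain Python. It is not the authors' OpenCL implementation, which the abstract does not detail; the function names `cholesky_unblocked` and `cholesky_tiled` and the `tile` parameter are hypothetical, chosen only for illustration. The comments mark the two steps whose iterations are independent of one another, which is what a GPU version would presumably spread across work-items.

```python
import math


def cholesky_unblocked(a):
    """Return lower-triangular L with L * L^T == a for a small SPD matrix.

    Only the lower triangle of `a` is read.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l


def cholesky_tiled(a, tile=2):
    """Right-looking tiled Cholesky on a copy of `a`; returns lower-triangular L."""
    n = len(a)
    w = [row[:] for row in a]  # working copy, overwritten tile by tile
    for k0 in range(0, n, tile):
        k1 = min(k0 + tile, n)

        # Step 1: factor the small diagonal tile (inherently sequential).
        diag = [row[k0:k1] for row in w[k0:k1]]
        ld = cholesky_unblocked(diag)
        for i in range(k0, k1):
            for j in range(k0, k1):
                w[i][j] = ld[i - k0][j - k0]

        # Step 2: panel solve X * Ld^T = A_panel for the rows below the tile.
        # Each row i is independent of the others, so rows can run in parallel.
        for i in range(k1, n):
            for j in range(k0, k1):
                s = w[i][j] - sum(w[i][p] * w[j][p] for p in range(k0, j))
                w[i][j] = s / w[j][j]

        # Step 3: trailing update A22 -= L21 * L21^T (lower triangle only).
        # Each entry (i, j) is independent, so this is the bulk parallel work.
        for i in range(k1, n):
            for j in range(k1, i + 1):
                w[i][j] -= sum(w[i][p] * w[j][p] for p in range(k0, k1))

    # Clear the strictly upper triangle so the result is exactly L.
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = 0.0
    return w


if __name__ == "__main__":
    spd = [[4.0, 12.0, -16.0], [12.0, 37.0, -43.0], [-16.0, -43.0, 98.0]]
    for row in cholesky_tiled(spd, tile=2):
        print(row)
```

In this sketch the trailing update in step 3 dominates the arithmetic for large matrices, which is consistent with the abstract's observation that performance grows with the matrix dimension: larger matrices spend a greater share of the work in the step that parallelizes well.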