Journal: Parallel Computing

A CPU-GPU hybrid approach for the unsymmetric multifrontal method

Abstract

The multifrontal method is an efficient direct method for solving large-scale sparse unsymmetric linear systems. It transforms the factorization of a large sparse matrix into a sequence of factorizations of smaller dense frontal matrices. Some of these dense operations can be accelerated by using a graphics processing unit (GPU). We analyze the unsymmetric multifrontal method from both an algorithmic and an implementation perspective to see how a GPU, in particular the NVIDIA Tesla C2070, can be used to accelerate the computations. Our main acceleration strategies are (i) performing BLAS operations on both the CPU and the GPU, (ii) improving the communication efficiency between the CPU and the GPU by using page-locked memory, zero-copy memory, and asynchronous memory copies, and (iii) a modified algorithm that reuses memory across GPU tasks and sets thresholds to determine whether a given task should be performed on the GPU. The proposed strategies are implemented by modifying UMFPACK, an unsymmetric multifrontal linear system solver. Numerical results show that the CPU-GPU hybrid approach can accelerate the unsymmetric multifrontal solver, especially for computationally expensive problems.
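
Strategy (iii) can be illustrated with a minimal sketch of threshold-based dispatch. The abstract does not say which quantity the thresholds are applied to, so this example assumes a floating-point operation count for the dense update inside a frontal matrix. The names `DenseUpdate`, `GPU_FLOP_THRESHOLD`, `choose_device`, and `plan`, and the cutoff value itself, are hypothetical and are not taken from UMFPACK or from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DenseUpdate:
    """Shape of one dense update C -= L @ U inside a frontal matrix (illustrative)."""
    m: int  # rows of L and C
    n: int  # columns of U and C
    k: int  # inner dimension (number of pivots eliminated)

    @property
    def flops(self) -> int:
        # Multiplying an (m x k) matrix by a (k x n) matrix costs about 2*m*n*k flops.
        return 2 * self.m * self.n * self.k


# Assumed cutoff, chosen only for illustration; a real value would be tuned
# against the measured CPU-GPU transfer cost on the target hardware.
GPU_FLOP_THRESHOLD = 2_000_000


def choose_device(update: DenseUpdate, threshold: int = GPU_FLOP_THRESHOLD) -> str:
    """Return "gpu" if the update meets the flop threshold, otherwise "cpu"."""
    return "gpu" if update.flops >= threshold else "cpu"


def plan(updates, threshold: int = GPU_FLOP_THRESHOLD):
    """Pair each update with the device it would be dispatched to."""
    return [(u, choose_device(u, threshold)) for u in updates]


if __name__ == "__main__":
    fronts = [DenseUpdate(10, 10, 10), DenseUpdate(100, 100, 100), DenseUpdate(500, 400, 64)]
    for update, device in plan(fronts):
        print(update, update.flops, device)
```

The point of such a rule is that small frontal matrices do too little arithmetic to repay the cost of moving data to the GPU, which is consistent with the abstract's remark that the hybrid approach helps most on computationally expensive problems.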
