Journal: Parallel Computing

Adaptive block size for dense QR factorization in hybrid CPU-GPU systems via statistical modeling

Abstract

QR factorization is a core computational kernel in scientific computing. How can modern hardware be used to accelerate it? We investigate this question by proposing a dense QR factorization algorithm with adaptive block sizes for a hybrid system that contains a central processing unit (CPU) and a graphics processing unit (GPU). To maximize utilization of both the CPU and the GPU, we develop an adaptive scheme that chooses the block size at each iteration. The decision is based on statistical surrogate models of performance and on an online monitor, which guards against unexpected, occasional performance drops. We implement the proposed schemes by modifying the highly optimized hybrid CPU-GPU QR factorization in MAGMA. Numerical results suggest that our approaches are efficient and lead to near-optimal block sizes. The proposed algorithm can be extended to other one-sided factorizations, such as LU and Cholesky.
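The abstract gives no implementation details, so the following is only a minimal sketch of what choosing a block size at each iteration from a surrogate model, with an online monitor as a safeguard, might look like. Everything in it is an assumption made for illustration: the candidate block sizes, the three-term cost model in `predicted_time`, the fallback rule, and the `measure` hook are hypothetical and are not taken from the paper or from MAGMA.

```python
CANDIDATES = (16, 32, 64, 128)  # hypothetical candidate block sizes
DEFAULT_NB = 64                 # hypothetical fallback when the model is distrusted


def predicted_time(coef, rows, cols, nb):
    """Toy surrogate: fixed overhead + panel work + trailing-update work."""
    panel = rows * nb * nb                  # panel factorization (CPU side)
    update = rows * nb * max(cols - nb, 0)  # trailing-matrix update (GPU side)
    return coef[0] + coef[1] * panel + coef[2] * update


def choose_block_size(coef, rows, cols):
    """Pick the feasible candidate with the lowest predicted time per column."""
    feasible = [nb for nb in CANDIDATES if nb <= cols] or [cols]
    return min(feasible, key=lambda nb: predicted_time(coef, rows, cols, nb) / nb)


def plan_block_sizes(m, n, coef, measure, tolerance=2.0):
    """Walk the panels of an m x n factorization, choosing a width per step.

    measure(rows, cols, nb) is a hypothetical hook that returns the observed
    time of one step; a real code would time the actual panel and update.
    """
    k, history, distrust = 0, [], False
    while k < n:
        rows, cols = m - k, n - k
        if distrust:
            nb = min(DEFAULT_NB, cols)  # monitor tripped: use the safe default
        else:
            nb = choose_block_size(coef, rows, cols)
        observed = measure(rows, cols, nb)
        # Online monitor: distrust the model for the next step if it was far off.
        distrust = observed > tolerance * predicted_time(coef, rows, cols, nb)
        history.append(nb)
        k += nb
    return history
```

In this sketch the surrogate coefficients are passed in as fixed numbers; a statistical model would instead be fitted to timing data collected on the target machine. The monitor only compares observed against predicted time for the previous step, which is one simple way to react to an occasional performance drop.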