Conference: International Conference on Parallel Processing

Variable-Size Batched LU for Small Matrices and its Integration into Block-Jacobi Preconditioning



Abstract

We present a set of new batched CUDA kernels for the LU factorization of a large collection of independent problems of different sizes, and for the subsequent triangular solves. All kernels heavily exploit the registers of the graphics processing unit (GPU) in order to deliver high performance for small problems. The development of these kernels is motivated by the need to tackle this embarrassingly parallel scenario in the context of block-Jacobi preconditioning, which is relevant for the iterative solution of sparse linear systems.
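To make the workflow concrete, the following is a minimal CPU reference sketch in plain Python. It is not the authors' CUDA implementation and does not model register usage or GPU parallelism. It only illustrates the two phases named in the abstract: factorizing a batch of small dense blocks of different sizes, then applying the block-Jacobi preconditioner through one pair of triangular solves per block. The function names (`lu_factor`, `lu_solve`, `batched_lu`, `block_jacobi_apply`) and the choice of partial pivoting are assumptions made for illustration.

```python
def lu_factor(a):
    """Factorize one small dense block in place (list of row lists).

    Uses partial (row) pivoting, an assumption of this sketch.
    Returns perm, where row k of the factored block is original row perm[k].
    """
    n = len(a)
    perm = list(range(n))
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("singular block")
        if p != k:
            a[k], a[p] = a[p], a[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Store the multipliers of L below the diagonal and update the trailing block.
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return perm


def lu_solve(lu, perm, b):
    """Solve A x = b given the packed LU factors and the row permutation."""
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]      # apply the row permutation to b
    for i in range(n):                      # forward solve with unit lower L
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):            # backward solve with upper U
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y


def batched_lu(blocks):
    """Factorize every diagonal block; the blocks may differ in size."""
    factors = []
    for blk in blocks:
        lu = [row[:] for row in blk]        # keep the caller's block intact
        factors.append((lu, lu_factor(lu)))
    return factors


def block_jacobi_apply(factors, r):
    """Compute z = M^{-1} r, where M is the block-diagonal preconditioner."""
    z, offset = [], 0
    for lu, perm in factors:
        n = len(lu)
        z.extend(lu_solve(lu, perm, r[offset:offset + n]))
        offset += n
    return z
```

Each block is processed independently of the others, which is the property that makes the batch embarrassingly parallel. In this sketch the loop over blocks is sequential, whereas a GPU version would distribute the blocks across threads.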
