Journal of Parallel and Distributed Computing

Batched transpose-free ADI-type preconditioners for a Poisson solver on GPGPUs


Abstract

We investigate the iterative solution of a symmetric positive definite linear system with the shifted Laplacian as the system matrix on General Purpose Graphics Processing Units (GPGPUs). We consider in particular the Chebyshev iteration because of its reduced global communication. The ADI-type preconditioner involves solving multiple (batched) symmetric positive tridiagonal Toeplitz systems along each coordinate direction. We investigate several variants for solving these tridiagonal systems: the Thomas algorithm, the Thomas algorithm combined with the SPIKE algorithm, and a polynomial approximation of the inverse. We test the various implementations numerically on two- and three-dimensional examples. It turns out that a combination of the Thomas algorithm and the approximate inverse leads to a solution that needs neither tiling nor transpositions. As a result, none of the kernels uses an extensive amount of shared memory, which yields very high GPU utilization and, more importantly, optimal coalesced global memory access patterns.
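To illustrate the batched tridiagonal solves mentioned in the abstract, here is a minimal CPU-side sketch of the Thomas algorithm for a symmetric tridiagonal Toeplitz matrix T = tridiag(b, a, b). It is not the paper's GPU implementation. The function name, the argument names, and the row-major batch layout are assumptions made for this example, and the sketch assumes a > 2|b| so that no pivoting is needed. Because T is Toeplitz, the elimination coefficients depend only on a and b and can be computed once for the whole batch.

```python
def thomas_toeplitz_batched(a, b, rhs):
    """Solve T x = r for every system in a batch, with T = tridiag(b, a, b).

    rhs is a list of n rows; row i holds entry i of all m right-hand sides.
    Hypothetical helper for illustration; assumes a > 2*|b| (no pivoting).
    """
    n = len(rhs)
    m = len(rhs[0])

    # Pivots and modified super-diagonal: identical for every system,
    # because the matrix is Toeplitz, so compute them once.
    pivot = [0.0] * n
    cp = [0.0] * n
    pivot[0] = a
    cp[0] = b / a
    for i in range(1, n):
        pivot[i] = a - b * cp[i - 1]
        cp[i] = b / pivot[i]

    # Forward elimination. The inner loop runs over the batch index j,
    # which mimics neighbouring GPU threads touching neighbouring entries.
    x = [row[:] for row in rhs]
    for j in range(m):
        x[0][j] /= pivot[0]
    for i in range(1, n):
        for j in range(m):
            x[i][j] = (x[i][j] - b * x[i - 1][j]) / pivot[i]

    # Back substitution.
    for i in range(n - 2, -1, -1):
        for j in range(m):
            x[i][j] -= cp[i] * x[i + 1][j]
    return x
```

For example, `thomas_toeplitz_batched(4.0, -1.0, rhs)` solves a batch of one-dimensional shifted-Laplacian systems with diagonal 4 and off-diagonals -1. In this layout the sequential recurrence runs along the row index while the batch index is the fastest-varying one, which is one way to keep memory accesses contiguous without transposing the data.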
