SIAM Journal on Scientific Computing

AMGX: A LIBRARY FOR GPU ACCELERATED ALGEBRAIC MULTIGRID AND PRECONDITIONED ITERATIVE METHODS

Abstract

The solution of large sparse linear systems arises in many applications, such as computational fluid dynamics and oil reservoir simulation. In realistic cases the matrices are often so large that they require large-scale distributed parallel computing to obtain the solution of interest in a reasonable time. In this paper we discuss the design and implementation of the AmgX library, which provides drop-in GPU acceleration of distributed algebraic multigrid (AMG) and preconditioned iterative methods. The AmgX library implements both classical and aggregation-based AMG methods with different selector and interpolation strategies, along with a variety of smoothers and preconditioners, including block-Jacobi, Gauss-Seidel, and incomplete-LU factorization. The library contains many of the standard and flexible preconditioned Krylov subspace iterative methods, which can be combined with any of the available multigrid methods or simpler preconditioners. The parallelism in the aggregation scheme exploits parallel graph matching techniques, while the smoothers and preconditioners often rely on parallel graph coloring algorithms. The AMG algorithm implemented in the AmgX library achieves a 2-5x speedup on a single GPU against a competitive implementation on the CPU. As will be shown in the numerical experiments section, both setup and solve phases scale well across multiple nodes, sustaining this performance advantage.
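
To make the role of graph coloring in the smoothers more concrete, the following is a minimal serial sketch in Python of a multicolor Gauss-Seidel sweep. It is not the AmgX API and not the paper's implementation: the names `poisson_1d`, `adjacency`, `greedy_coloring`, and `multicolor_gauss_seidel` are hypothetical, and a simple serial greedy coloring stands in for the parallel coloring algorithms the abstract refers to. The point it illustrates is that rows sharing a color have no coupling to one another, so their updates are independent and could in principle run concurrently.

```python
def poisson_1d(n):
    """Hypothetical test matrix: 1D Poisson stencil [-1, 2, -1],
    stored as a list of {column: value} dictionaries (one per row)."""
    rows = []
    for i in range(n):
        row = {i: 2.0}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        rows.append(row)
    return rows


def adjacency(A):
    """Adjacency lists of the matrix graph (off-diagonal nonzeros)."""
    return [[j for j in row if j != i] for i, row in enumerate(A)]


def greedy_coloring(adj):
    """Serial greedy coloring: each vertex takes the smallest color
    not already used by one of its colored neighbors."""
    colors = [-1] * len(adj)
    for v in range(len(adj)):
        used = {colors[u] for u in adj[v] if colors[u] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors


def multicolor_gauss_seidel(A, b, x, colors, sweeps=1):
    """Gauss-Seidel sweeps that visit the rows one color at a time.
    Rows of the same color never reference each other, so the inner
    loop over such rows is order-independent (the parallel part)."""
    num_colors = max(colors) + 1
    for _ in range(sweeps):
        for c in range(num_colors):
            for i in range(len(A)):
                if colors[i] != c:
                    continue
                sigma = sum(a * x[j] for j, a in A[i].items() if j != i)
                x[i] = (b[i] - sigma) / A[i][i]
    return x


if __name__ == "__main__":
    n = 8
    A = poisson_1d(n)
    colors = greedy_coloring(adjacency(A))
    # Right-hand side chosen so that the exact solution is all ones.
    b = [sum(row.values()) for row in A]
    x = multicolor_gauss_seidel(A, b, [0.0] * n, colors, sweeps=200)
    print("colors:", colors)
    print("max error:", max(abs(xi - 1.0) for xi in x))
```

For the tridiagonal test matrix this coloring reduces to the familiar red-black ordering. In a GPU setting each color would typically map to one kernel launch over the rows of that color, which is why a small number of colors is desirable.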
