Conference: International Conference for High Performance Computing, Networking, Storage and Analysis

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices

Abstract

A new sparse benchmark, the High Performance Conjugate Gradient benchmark (HPCG), was recently released to address challenges in designing sparse linear solvers for next-generation extreme-scale computing systems. The key computation, data access, and communication patterns in HPCG represent building blocks commonly found in today's HPC applications. Efficiently parallelizing the Gauss-Seidel smoother, the most time-consuming kernel in HPCG, is a well-known challenge; our algorithmic and architecture-aware optimizations nevertheless deliver 95% and 68% of the achievable bandwidth on Xeon and Xeon Phi, respectively. Depending on the available parallelism, our Xeon Phi shared-memory implementation of the Gauss-Seidel smoother selectively applies block multi-color reordering. Combined with MPI parallelization, our implementation balances parallelism, data access locality, CG convergence rate, and communication overhead. It achieved 580 TFLOPS (82% parallelization efficiency) on the Tianhe-2 system, ranking first on the most recent HPCG list in July 2014. In addition, we demonstrate that our optimizations benefit not only the original HPCG dataset, which is based on a structured 3D grid, but also a wide range of unstructured matrices.
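The abstract names multi-color reordering as the way to expose parallelism in the Gauss-Seidel smoother. The following is a minimal sketch of that general idea, not the authors' implementation: it uses plain point coloring rather than the block multi-color reordering described above, runs serially, and assumes a matrix with a symmetric sparsity pattern stored as one `{column: value}` dict per row. All function names are hypothetical and chosen for illustration only.

```python
# Sketch: point multi-color Gauss-Seidel on a sparse matrix stored as
# one {column: value} dict per row. All names here are illustrative.

def greedy_coloring(rows):
    """Give each row the smallest color not used by a coupled row.

    Assumes a symmetric sparsity pattern, so rows i and j are coupled
    exactly when rows[i] contains column j.
    """
    colors = [-1] * len(rows)
    for i, row in enumerate(rows):
        taken = {colors[j] for j in row if j != i and colors[j] >= 0}
        c = 0
        while c in taken:
            c += 1
        colors[i] = c
    return colors


def colored_gauss_seidel_sweep(rows, b, x, colors):
    """One forward Gauss-Seidel sweep, visiting rows color by color.

    Rows of one color never reference each other, so the inner loop has
    no dependencies between its iterations and could be run in parallel.
    """
    for c in range(max(colors) + 1):
        for i, row in enumerate(rows):
            if colors[i] != c:
                continue
            off_diag = sum(a * x[j] for j, a in row.items() if j != i)
            x[i] = (b[i] - off_diag) / row[i]
    return x


def laplacian_1d(n):
    """Tridiagonal [-1, 2, -1] test matrix (a structured stand-in)."""
    rows = []
    for i in range(n):
        row = {i: 2.0}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        rows.append(row)
    return rows


def residual_norm(rows, b, x):
    """Max-norm of b - A x."""
    return max(abs(b[i] - sum(a * x[j] for j, a in row.items()))
               for i, row in enumerate(rows))
```

Because the coloring is built from each row's nonzero columns rather than from grid coordinates, the same two functions apply to unstructured matrices. Fewer colors mean more rows per parallel step, while changing the update order can affect how quickly CG converges, which is the trade-off the abstract refers to.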
