
High performance conjugate gradient solver on multi-GPU clusters using hypergraph partitioning



Abstract

Motivated by the high computational power and low price-to-performance ratio of GPUs, GPU-accelerated clusters are being built for high performance scientific computing. In this work, we propose a scalable implementation of a Conjugate Gradient (CG) solver for unstructured matrices on a GPU-extended cluster, where each cluster node has multiple GPUs. The basic computations of the solver are performed on the GPUs, while communication is managed by the CPU. For sparse matrix-vector multiplication, the most time-consuming operation, the solver selects the fastest among several high performance kernels running on the GPUs. On a GPU-extended cluster, scalability is harder to obtain than on a traditional CPU cluster, because GPUs are very fast compared to CPUs: since computation on the GPUs is faster, GPU-extended clusters demand faster communication between compute units. To achieve scalability, we adopt hypergraph-partitioning models, which are state-of-the-art models for communication reduction and load balancing in parallel sparse iterative solvers. We implement a hierarchical partitioning model that better optimizes the underlying heterogeneous system. In our experiments, we obtain up to 94 Gflops of double-precision CG performance using 64 NVIDIA Tesla GPUs on 32 nodes.
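To make the structure of the solver concrete, below is a minimal, hedged sketch of a plain CG iteration (not the authors' implementation). The `spmv` callback stands in for the GPU sparse matrix-vector kernel the abstract describes as the dominant cost; in the paper's setting, `spmv` and the vector operations would run on GPUs while the host manages inter-node communication. All names here are illustrative.

```python
import numpy as np

def conjugate_gradient(spmv, b, x0, tol=1e-10, max_iter=1000):
    """Plain CG for a symmetric positive-definite system.

    spmv(v) computes A @ v; in a multi-GPU solver this call (and the
    dot products / axpy updates below) would execute on the GPUs,
    with the CPU handling halo exchange between partitions.
    """
    x = x0.copy()
    r = b - spmv(x)          # initial residual
    p = r.copy()             # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = spmv(p)         # dominant cost: sparse matrix-vector product
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# usage: a small SPD tridiagonal test system
n = 100
A = (np.diag(np.full(n, 2.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
b = np.ones(n)
x = conjugate_gradient(lambda v: A @ v, b, np.zeros(n))
assert np.allclose(A @ x, b, atol=1e-6)
```

The sketch uses a dense test matrix only for self-containedness; the point of the paper is precisely that for large unstructured sparse matrices the SpMV step and its communication pattern dominate, which is what the hypergraph partitioning optimizes.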

Bibliographic details

  • Source
    Computer science | 2010, Issue 2 | pp. 83-91 | 9 pages
  • Author affiliations

    Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan;

    Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan;

    Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan; National Institute of Informatics, Hitotsubashi 4-5-6, Chiyoda-ku, Tokyo 101-8430, Japan;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    GPU computing; GPU cluster; conjugate gradients; hypergraph partitioning;

  • Added to database: 2022-08-17 13:50:49

