
Optimized HPL for AMD GPU and multi-core CPU usage

Abstract

The installation of the LOEWE-CSC supercomputer at the Goethe University in Frankfurt led to the development of a Linpack implementation that can fully utilize the installed AMD Cypress GPUs. At its core, a fast DGEMM for combined GPU and CPU usage was created. The DGEMM library is tuned to hide all DMA transfer times and thus maximize the GPU load. A work-stealing scheduler was implemented to add the remaining CPU resources to the DGEMM. On the GPU, the DGEMM achieves 497 GFlop/s (90.9% of the theoretical peak). Combined with the 24-core Magny-Cours CPUs, 623 GFlop/s (83.6% of the peak) are achieved. The HPL benchmark was modified to perform well with one MPI process per node. The modifications include multi-threading, vectorization, use of the GPU DGEMM, cache optimizations, and a new Lookahead algorithm. A Linpack performance of 70% of the theoretical peak is achieved, and this performance scales linearly to hundreds of nodes.
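
The efficiency figures in the abstract can be turned back into the peak rates they imply. The short sketch below does that arithmetic. The function name `implied_peak` and the variable names are illustrative, not taken from the paper, and the results are only as precise as the rounded numbers quoted in the abstract. The CPU share is obtained by subtraction, on the assumption that the combined peak is the sum of the GPU and CPU peaks.

```python
def implied_peak(achieved_gflops: float, efficiency_percent: float) -> float:
    """Return the peak rate implied by an achieved rate and its efficiency.

    efficiency = achieved / peak, so peak = achieved / efficiency.
    """
    if efficiency_percent <= 0:
        raise ValueError("efficiency must be positive")
    return achieved_gflops / (efficiency_percent / 100.0)


# Figures quoted in the abstract (rounded in the source).
gpu_peak = implied_peak(497.0, 90.9)    # GPU-only DGEMM
node_peak = implied_peak(623.0, 83.6)   # GPU plus 24-core CPU DGEMM

# Assumption: combined peak = GPU peak + CPU peak.
cpu_peak = node_peak - gpu_peak

if __name__ == "__main__":
    print(f"implied GPU peak:      {gpu_peak:.1f} GFlop/s")
    print(f"implied combined peak: {node_peak:.1f} GFlop/s")
    print(f"implied CPU share:     {cpu_peak:.1f} GFlop/s")
```

Read this way, the combined efficiency is lower than the GPU-only efficiency even though the absolute rate rises, because adding the CPU enlarges the peak against which the result is measured.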

Bibliographic details

  • Source
    Computer science, 2011, No. 4, pp. 153-164 (12 pages)
  • Author affiliations

    Frankfurt Institute for Advanced Studies, Ruth-Moufang-Strasse 1, 60438 Frankfurt am Main, Germany;

    Frankfurt Institute for Advanced Studies, Ruth-Moufang-Strasse 1, 60438 Frankfurt am Main, Germany;

    Frankfurt Institute for Advanced Studies, Ruth-Moufang-Strasse 1, 60438 Frankfurt am Main, Germany;

    Frankfurt Institute for Advanced Studies, Ruth-Moufang-Strasse 1, 60438 Frankfurt am Main, Germany;

  • Original format: PDF
  • Text language: English
  • Keywords

    heterogeneous computing; linpack; HPL; DGEMM; CALDGEMM; GPGPU;

  • Added to database: 2022-08-17 13:50:28
