Conference: International conference on high performance computing

Lattice-CSC: Optimizing and Building an Efficient Supercomputer for Lattice-QCD and to Achieve First Place in Green500

Abstract

In recent decades, supercomputers have become a necessity in science and industry. Huge data centers consume enormous amounts of electricity, and we have reached a point where newer, faster computers must no longer draw more power than their predecessors. Because user demand for compute capability has not declined, the feasibility of exaflop systems is now being studied. Heterogeneous clusters with highly efficient accelerators such as GPUs are one approach to higher efficiency. We present the new L-CSC cluster, a commodity-hardware compute cluster dedicated to Lattice QCD (LQCD) simulations at the GSI research facility. L-CSC features a multi-GPU design with four FirePro S9150 GPUs per node, each providing 320 GB/s of memory bandwidth and 2.6 TFLOPS of peak performance. The high bandwidth makes it ideally suited to memory-bound LQCD computations, while the multi-GPU design ensures superior power efficiency. The November 2014 Green500 list ranked L-CSC as the most power-efficient supercomputer in the world, with 5270 MFLOPS/W in the Linpack benchmark. This paper presents optimizations to our Linpack implementation, HPL-GPU, and other power-efficiency improvements that helped L-CSC reach this result. It describes our approach to an accurate Green500 power measurement and reveals some problems with the current measurement methodology. Finally, it gives an overview of the Lattice QCD application on L-CSC.
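
The hardware figures quoted in the abstract can be put side by side with a little arithmetic. The following is only a back-of-the-envelope sketch, not anything taken from the paper: the function and constant names are hypothetical, and the only inputs are the four numbers stated above (four GPUs per node, 320 GB/s and 2.6 TFLOPS per GPU, 5270 MFLOPS/W). It computes the aggregate GPU bandwidth and peak per node, the ratio of memory bandwidth to peak compute (bytes per flop), and the Linpack throughput that a given power budget would sustain at the reported efficiency.

```python
# Figures quoted in the abstract (per FirePro S9150 GPU unless noted).
GPUS_PER_NODE = 4
MEM_BANDWIDTH_GBS = 320.0        # GB/s per GPU
PEAK_TFLOPS = 2.6                # TFLOPS per GPU
GREEN500_MFLOPS_PER_W = 5270.0   # Linpack efficiency, November 2014 list


def node_totals(gpus=GPUS_PER_NODE, bw=MEM_BANDWIDTH_GBS, peak=PEAK_TFLOPS):
    """Aggregate GPU memory bandwidth (GB/s) and peak (TFLOPS) of one node."""
    return gpus * bw, gpus * peak


def bytes_per_flop(bw_gbs=MEM_BANDWIDTH_GBS, peak_tflops=PEAK_TFLOPS):
    """Bytes of memory traffic available per peak floating-point operation."""
    return (bw_gbs * 1e9) / (peak_tflops * 1e12)


def linpack_tflops_for_power(power_w, eff=GREEN500_MFLOPS_PER_W):
    """Linpack throughput (TFLOPS) sustained by power_w watts at efficiency eff."""
    return power_w * eff * 1e6 / 1e12


if __name__ == "__main__":
    bw_total, peak_total = node_totals()
    print(f"per node: {bw_total:.0f} GB/s, {peak_total:.1f} TFLOPS peak")
    print(f"balance: {bytes_per_flop():.3f} bytes/flop")
    print(f"1 kW at list efficiency: {linpack_tflops_for_power(1000.0):.2f} TFLOPS")
```

Read this way, each GPU offers roughly 0.12 bytes of memory traffic per peak flop, so a kernel that performs fewer than about 8 flops per byte moved is limited by bandwidth rather than by arithmetic. That is the sense in which high memory bandwidth matters for memory-bound LQCD computations.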
