International Journal of High Performance Computing Applications

Evaluation of XcalableACC with tightly coupled accelerators/InfiniBand hybrid communication on accelerated cluster



Abstract

Accelerated clusters, which are cluster systems equipped with accelerators, are among the most common systems in parallel computing. To exploit the performance of such systems, it is important to reduce communication latency between accelerator memories. There is also a need for a programming language that makes it easier to achieve and maintain high performance on such systems. The goal of the present article is to evaluate XcalableACC (XACC), a parallel programming language, with tightly coupled accelerators/InfiniBand (TCA/IB) hybrid communication on an accelerated cluster. TCA/IB hybrid communication combines the low-latency communication of TCA with the high bandwidth of IB. The XACC language, which is a directive-based language for accelerated clusters, enables programmers to use TCA/IB hybrid communication with ease. To evaluate the performance of XACC with TCA/IB hybrid communication, we implemented a lattice quantum chromodynamics (LQCD) mini-application and evaluated it on our accelerated cluster using up to 64 compute nodes. For comparison with XACC, we also implemented the LQCD mini-application using a combination of CUDA and MPI (CUDA + MPI) and a combination of OpenACC and MPI (OpenACC + MPI). Performance evaluation revealed that the performance of XACC with TCA/IB hybrid communication is 9% better than that of CUDA + MPI and 18% better than that of OpenACC + MPI. Furthermore, a new extension to XACC was found to improve its performance by a further 7%. Productivity evaluation revealed that XACC requires far fewer changes to the serial LQCD code to produce the parallel LQCD code than CUDA + MPI or OpenACC + MPI. Moreover, because XACC parallelizes a program while preserving the appearance of the sequential code, XACC code is highly readable, and its directive-based approach gives it excellent portability.
