Evaluation of XcalableACC with tightly coupled accelerators/InfiniBand hybrid communication on accelerated cluster

Nakao Masahiro; Odajima Tetsuya; Murai Hitoshi; Tabuchi Akihiro; Fujita Norihisa; Hanawa Toshihiro; Boku Taisuke; Sato Mitsuhisa

Abstract

Accelerated clusters, which are cluster systems equipped with accelerators, are among the most common systems in parallel computing. To exploit the performance of such systems, it is important to reduce communication latency between accelerator memories. There is also a need for a programming language that makes it easier to sustain high performance on such systems. The goal of the present article is to evaluate XcalableACC (XACC), a parallel programming language, with tightly coupled accelerators/InfiniBand (TCA/IB) hybrid communication on an accelerated cluster. TCA/IB hybrid communication combines the low-latency communication of TCA with the high bandwidth of IB. The XACC language, which is a directive-based language for accelerated clusters, enables programmers to use TCA/IB hybrid communication with ease. To evaluate the performance of XACC with TCA/IB hybrid communication, we implemented the lattice quantum chromodynamics (LQCD) mini-application and evaluated it on our accelerated cluster using up to 64 compute nodes. For comparison with XACC, we also implemented the LQCD mini-application using a combination of CUDA and MPI (CUDA + MPI) and a combination of OpenACC and MPI (OpenACC + MPI). Performance evaluation revealed that the performance of XACC with TCA/IB hybrid communication is 9% better than that of CUDA + MPI and 18% better than that of OpenACC + MPI. Furthermore, the performance of XACC was found to increase by a further 7% with a new extension to XACC. Productivity evaluation revealed that XACC requires far fewer changes to the serial LQCD code to implement the parallel LQCD code than CUDA + MPI and OpenACC + MPI do. Moreover, since XACC can perform parallelization while maintaining the sequential code image, XACC is highly readable and shows excellent portability due to its directive-based approach.
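The trade-off behind pairing a low-latency link with a high-bandwidth link can be illustrated with the standard latency-bandwidth cost model, in which transfer time is startup latency plus message size divided by bandwidth. The sketch below is only an illustration of that general model: the `Link` class, the `pick_link` and `crossover_bytes` helpers, and the latency and bandwidth figures are all hypothetical and are not taken from the article, and the article's actual selection policy between TCA and IB may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """An interconnect described by the latency-bandwidth cost model."""
    name: str
    latency_s: float       # per-message startup cost in seconds (hypothetical)
    bandwidth_bps: float   # sustained bandwidth in bytes/second (hypothetical)

    def transfer_time(self, nbytes: int) -> float:
        # Modeled time: fixed startup cost plus size-proportional cost.
        return self.latency_s + nbytes / self.bandwidth_bps


def pick_link(nbytes: int, links: list) -> Link:
    """Return the link with the smallest modeled transfer time."""
    return min(links, key=lambda link: link.transfer_time(nbytes))


def crossover_bytes(low_latency: Link, high_bandwidth: Link) -> float:
    """Message size at which both links take the same modeled time."""
    latency_gap = high_bandwidth.latency_s - low_latency.latency_s
    per_byte_gap = 1.0 / low_latency.bandwidth_bps - 1.0 / high_bandwidth.bandwidth_bps
    return latency_gap / per_byte_gap


# Illustrative numbers only; NOT measurements from the article.
TCA = Link("TCA", latency_s=2e-6, bandwidth_bps=2e9)
IB = Link("IB", latency_s=6e-6, bandwidth_bps=6e9)

if __name__ == "__main__":
    for size in (1024, 1_000_000):
        print(size, pick_link(size, [TCA, IB]).name)
    print("crossover (bytes):", crossover_bytes(TCA, IB))
```

Under these assumed figures, small messages favor the low-latency link and large messages favor the high-bandwidth link, which is the intuition behind a hybrid scheme.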

Bibliographic details

Source
International Journal of High Performance Computing Applications | 2019, Issue 5 | pp. 869-884 | 16 pages
Authors
Nakao Masahiro; Odajima Tetsuya; Murai Hitoshi; Tabuchi Akihiro; Fujita Norihisa; Hanawa Toshihiro; Boku Taisuke; Sato Mitsuhisa
Affiliations

RIKEN Ctr Computat Sci Kobe Hyogo Japan;

Fujitsu Labs Ltd Kawasaki Kanagawa Japan;

Univ Tsukuba Ctr Computat Sci Tsukuba Ibaraki Japan;

Univ Tokyo Informat Technol Ctr Tokyo Japan;

Univ Tsukuba Grad Sch Syst & Informat Engn Tsukuba Ibaraki Japan|Univ Tsukuba Ctr Computat Sci Tsukuba Ibaraki Japan;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Parallel programming language; interconnect; heterogeneous system; GPU; lattice quantum chromodynamics
