Conference: AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition

Performance of Unstructured Finite Volume Code on a Cluster with Multiple GPUs per Node

Abstract

This paper investigates the performance of an unstructured finite volume code on a multi-CPU, multi-GPU cluster. The cluster attempts to balance I/O, GPU, and CPU performance to accommodate a wide variety of codes. A new unstructured finite volume code that runs in parallel using MPI/OpenMP and MPI/CUDA is presented, and its performance on this purpose-built GPU cluster is examined under a number of operating conditions. The GPU cluster is a collection of 24 compute nodes, each consisting of two 6-core Intel Core i7 processors and two NVIDIA GPUs, with one or two QDR InfiniBand ports connected to a switch. Eight of the compute nodes have two NVIDIA Fermi Tesla GPUs that are well connected with two QDR InfiniBand cards. The remaining 18 have two NVIDIA Fermi video GPUs. The use of multiple chipsets creates non-uniform access to both the GPUs and InfiniBand, potentially creating bottlenecks when transferring data between the CPU and the GPU and between nodes. This paper also explores these issues and potential solutions.