Performance of Unstructured Finite Volume Code on a Cluster with Multiple GPUs per Node

Abstract

This paper investigates the performance of an unstructured finite volume code on a multi-CPU, multi-GPU cluster. The cluster attempts to balance I/O, GPU, and CPU performance to accommodate a wide variety of codes. A new unstructured finite volume code that runs in parallel using MPI/OpenMP and MPI/CUDA is presented, and its performance on a purpose-built GPU cluster is examined under a number of operating conditions. The GPU cluster is a collection of 24 compute nodes, each consisting of two 6-core Intel Core i7 processors and two NVIDIA GPUs, with one to two QDR InfiniBand ports connected to a switch. Eight of the compute nodes have two NVIDIA Fermi Tesla GPUs, well connected with two QDR InfiniBand cards. The remaining 18 have two NVIDIA Fermi Video GPUs. The use of multiple chipsets creates non-uniform access to both the GPUs and InfiniBand, potentially creating bottlenecks when transferring data between the CPU and the GPU and between nodes. The paper also explores these issues and potential solutions.

Bibliographic record

Source
AIAA aerospace sciences meeting including the new horizons forum and aerospace exposition | 2011 | pp. 12762-12771 | 10 pages
Authors
Keith Obenschain; Andrew Corrigan; Gopal Patnaik
Original format: PDF
CLC classification: Aviation

Similar literature

1. Performance and Comparison of Cell-Centered and Node-Centered Unstructured Finite Volume Discretizations for Shallow Water Free Surface Flows [J]. A.I. Delis, I.K. Nikolos, M. Kazolea. Archives of Computational Methods in Engineering. 2011, No. 1
2. Improvement of the computational performance of a parallel unstructured WENO finite volume CFD code for Implicit Large Eddy Simulation [J]. Panagiotis Tsoutsanis, Antonis F. Antoniadis, Karl W. Jenkins. Computers & Fluids. 2018
3. Finite Element-Node-Centered Finite-Volume Two-Phase-Flow Experiments With Fractured Rock Represented by Unstructured Hybrid-Element Meshes [J]. Stephan K. Matthaei, Andrey Mezentsev, Mandefro Belayneh. SPE Reservoir Evaluation & Engineering. 2007, No. 6
4. Performance of Unstructured Finite Volume Code on a Cluster with Multiple GPUs per Node [C]. Keith Obenschain, Andrew Corrigan, Gopal Patnaik. AIAA aerospace sciences meeting including the new horizons forum and aerospace exposition. 2011
5. Families of control-volume distributed CVD (MPFA) finite volume schemes for the porous medium pressure equation on structured and unstructured grids [D]. Pal, Mayur. 2007
6. Accuracy of compact-stencil interpolation algorithms for unstructured mesh finite volume solver [O]. Adek Tasri, Anita Susilawati. 2021
7. A GPU-enabled implicit Finite Volume solver for the ideal two-fluid plasma model on unstructured grids [O]. Isaac Alonso Asensio, Alejandro Alvarez Laguna, Mohamed Hassanine Aissa. 2019
8. Comparison of Node-Centered and Cell-Centered Unstructured Finite-Volume Discretizations: Viscous Fluxes [R]. Diskin, B., Thomas, J. L., Nielsen, E. J. 2010
