On the Performance Portability of Structured Grid Codes on Many-Core Computer Architectures

Abstract

With the advent of many-core computer architectures such as GPGPUs from NVIDIA and AMD, and more recently Intel's Xeon Phi, ensuring performance portability of HPC codes is potentially becoming more complex. In this work we have focused on one important application area - structured grid codes - and investigated techniques for ensuring performance portability across a diverse range of different, high-end many-core architectures. We chose three codes to investigate: a 3D lattice Boltzmann code (D3Q19 BGK), the CloverLeaf hydrodynamics mini application from Sandia's Mantevo benchmark suite, and ROTORSIM, a production-quality structured grid, multiblock, compressible finite-volume CFD code. We have developed OpenCL versions of these codes in order to provide cross-platform functional portability, and compared the performance of the OpenCL versions of these structured grid codes to optimized versions on each platform, including hybrid OpenMP/MPI/AVX versions on CPUs and Xeon Phi, and CUDA versions on NVIDIA GPUs. Our results show that, contrary to conventional wisdom, using OpenCL it is possible to achieve a high degree of performance portability, at least for structured grid applications, using a set of straightforward techniques. The performance portable code in OpenCL is also highly competitive with the best performance using the native parallel programming models on each platform.

Bibliographic details

Source
International Supercomputing Conference | 2012 | pp. 53-75 | 23 pages
Authors
Simon McIntosh-Smith; Michael Boulton; Dan Curran; James Price
Original format
PDF
Keywords
Many-core; heterogeneous; GPU; Xeon Phi; structured grid; multi-grid multi-block; lattice Boltzmann

Related literature

1. Performance analysis of a 3D unstructured mesh hydrodynamics code on multi-core and many-core architectures [J]. Waltz J., Wohlbier J. G., Risinger L. D. International Journal for Numerical Methods in Fluids, 2015, no. 6.
2. Portable multi- and many-core performance for finite-difference or finite-element codes – application to the free-surface component of NEMO (NEMOLite2D 1.0) [J]. Porter Andrew R., Appleyard Jeremy, Ashworth Mike. Geoscientific Model Development, 2018, no. 8.
3. On the Performance Portability of Structured Grid Codes on Many-Core Computer Architectures [C]. Simon McIntosh-Smith, Michael Boulton, Dan Curran. ISC 2014, 2014.
4. Memory optimization in codelet execution model on many-core architectures [D]. Wu, Yao. 2014.
5. High-Performance 3D Compressive Sensing MRI Reconstruction Using Many-Core Architectures [O]. Daehyun Kim, Joshua Trzasko, Mikhail Smelyanskiy. 2011.
6. Reviewing the Computational Performance of Structured and Unstructured Grid Deterministic SN Transport Sweeps on Many-Core Architectures [O]. Tom Deakin, Simon McIntosh-Smith, Justin Lovegrove. 2020.
