25th International Conference on Parallel Computational Fluid Dynamics 2013

OpenCL implementation of basic operations for a high-order finite-volume polynomial scheme on unstructured hybrid meshes

Abstract

This work considers a parallel finite-volume algorithm based on a cell-centered high-order polynomial scheme for unstructured hybrid meshes. It focuses on adapting and optimizing the basic operations of the algorithm for different architectures of massively parallel accelerators, including AMD and NVIDIA GPUs. Such an algorithm is especially problematic for GPU architectures because its FLOP-per-byte ratio is very low, so performance is limited by the memory bandwidth of a device rather than by its computing performance. In addition, its memory access pattern is irregular because unstructured meshes are used. The calculation of polynomial coefficients and the calculation of convective fluxes through cell faces are the most interesting and time-consuming operations of the algorithm. Their OpenCL implementations for accelerators are considered in detail. Ways to improve computational efficiency are proposed, and performance measurements reaching up to 160 GFLOPS on a single GPU device are demonstrated.
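The link between a low FLOP-per-byte ratio and bandwidth-limited performance can be illustrated with the standard roofline estimate, in which attainable throughput is the smaller of the compute peak and the product of memory bandwidth and arithmetic intensity. The sketch below is only an illustration of that reasoning, not the authors' performance model. The function names and the device figures (a 1000 GFLOPS peak and 200 GB/s of bandwidth) are hypothetical and are not taken from the paper.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gb_s: float,
                      flop_per_byte: float) -> float:
    """Roofline upper bound in GFLOPS.

    GB/s multiplied by FLOP/byte gives GFLOP/s, so the bandwidth-limited
    rate is directly comparable with the compute peak.
    """
    if min(peak_gflops, bandwidth_gb_s, flop_per_byte) < 0:
        raise ValueError("all inputs must be non-negative")
    return min(peak_gflops, bandwidth_gb_s * flop_per_byte)


def is_memory_bound(peak_gflops: float, bandwidth_gb_s: float,
                    flop_per_byte: float) -> bool:
    """True when the bandwidth-limited rate is below the compute peak."""
    return bandwidth_gb_s * flop_per_byte < peak_gflops


if __name__ == "__main__":
    # Hypothetical device figures, chosen only for illustration.
    PEAK_GFLOPS = 1000.0
    BANDWIDTH_GB_S = 200.0

    for intensity in (0.25, 0.5, 1.0, 8.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, intensity)
        regime = ("memory-bound"
                  if is_memory_bound(PEAK_GFLOPS, BANDWIDTH_GB_S, intensity)
                  else "compute-bound")
        print(f"{intensity:5.2f} FLOP/byte -> at most {bound:7.1f} GFLOPS ({regime})")
```

Under these assumed figures, any kernel below 5 FLOP per byte cannot reach the compute peak however well its arithmetic is tuned, which is why reducing and regularizing memory traffic matters most for this class of algorithm.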
