Procedia Computer Science

A Portable OpenCL Lattice Boltzmann Code for Multi- and Many-core Processor Architectures


Abstract

The architecture of high performance computing systems is becoming more and more heterogeneous, as accelerators play an increasingly important role alongside traditional CPUs. Programming heterogeneous systems efficiently is a complex task that often requires the use of specific programming environments. Programming frameworks that support code portable across different high performance architectures have recently appeared, but one must carefully assess the relative costs of portability versus computing efficiency and find a reasonable tradeoff point. In this paper we address precisely this issue, using a Lattice Boltzmann code implemented in OpenCL as a test bench. We analyze its performance on several different state-of-the-art processors: NVIDIA GPUs and Intel Xeon Phi many-core accelerators, as well as more traditional Ivy Bridge and Opteron multi-core commodity CPUs. We also compare these results with those obtained from codes specifically optimized for each of these systems. Our work shows that a properly structured OpenCL code runs on many different systems, reaching performance levels close to those obtained by architecture-tuned CUDA or C codes.
