Euromicro International Conference on Parallel, Distributed and Network-Based Processing

Evaluation of Successive CPUs/APUs/GPUs Based on an OpenCL Finite Difference Stencil

Abstract

The AMD APU (Accelerated Processing Unit) architecture, which combines CPU and GPU cores on the same die, is promising for GPU applications whose performance is bottlenecked by the low PCI Express communication rate. However, the first APU generations still have separate CPU and GPU memory partitions. Currently, APU integrated GPUs are also less powerful than discrete GPUs. In this paper we therefore investigate the value of APUs for scientific computing by evaluating and comparing the performance of two successive AMD APUs (family codenames Llano and Trinity), two successive discrete GPUs (chip codenames Cayman and Tahiti), and one hexa-core AMD CPU. For this purpose, we rely on a 3D finite difference stencil that is optimized and tuned in OpenCL. We detail the most interesting optimizations for each architecture and show very good performance in OpenCL: up to 500 Gflops on Tahiti. Finally, our results show that APU integrated GPUs outperform CPUs, and that integrated GPUs of upcoming APUs may match discrete GPUs for problems with high communication requirements.
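
As a rough illustration of the kind of computation a 3D finite difference stencil performs, the sketch below applies a 7-point, second-order Laplacian to the interior of a small grid. The abstract does not state the stencil order, data layout, or the OpenCL optimizations used, so the 7-point form, the flat x-fastest layout, and the name `laplacian_7pt` are all assumptions made for illustration; this is plain Python, not the authors' OpenCL kernel.

```python
def laplacian_7pt(u, nx, ny, nz, h=1.0):
    """Apply a 7-point Laplacian to the interior of a flat nx*ny*nz grid.

    Assumed layout (not from the paper): x varies fastest, then y, then z.
    Boundary points are left at 0.0.
    """
    out = [0.0] * (nx * ny * nz)
    inv_h2 = 1.0 / (h * h)
    plane = nx * ny  # stride between consecutive z-planes
    for k in range(1, nz - 1):
        for j in range(1, ny - 1):
            for i in range(1, nx - 1):
                c = (k * ny + j) * nx + i  # flat index of the centre point
                out[c] = inv_h2 * (
                    u[c - 1] + u[c + 1]            # x neighbours
                    + u[c - nx] + u[c + nx]        # y neighbours
                    + u[c - plane] + u[c + plane]  # z neighbours
                    - 6.0 * u[c]
                )
    return out


# Usage: for u = x^2 + y^2 + z^2 the exact Laplacian is 6 everywhere,
# and a second-order stencil reproduces it exactly on interior points.
n = 4
grid = [float(i * i + j * j + k * k)
        for k in range(n) for j in range(n) for i in range(n)]
result = laplacian_7pt(grid, n, n, n)
print(result[(1 * n + 1) * n + 1])  # interior point (1, 1, 1) -> 6.0
```

Each output point reads its six neighbours and itself, so the kernel performs few arithmetic operations per memory access. That low ratio is one reason data movement, including transfers over PCI Express for discrete GPUs, can weigh heavily on overall performance for this class of problem.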
