Towards Compiler-Agnostic Performance in Finite-Difference Codes

Abstract

In this paper we evaluate the performance implications of applying a technique which we call PSyKAl to finite-difference ocean models. In PSyKAl the code related to the underlying science is formally separated from code related to parallelisation and single-core optimisations. This separation of concerns allows scientists to code their science independently of the underlying hardware architecture (thereby keeping a single code base) and allows optimisation specialists to tailor the code for a particular machine independently of the science code. A finite-difference shallow-water benchmark optimised for cache-based architectures is taken as the starting point. A vanilla PSyKAl version is written and the performance of the two compared. The optimisations that were applied to the original benchmark (loop fusion etc.) are then manually applied to the PSyKAl version as a set of code modifications to the optimisation layer. Performance results are presented for the Cray, Intel and GNU compilers on Intel Ivy Bridge and Haswell processors and for the IBM compiler on Power8. Results show that the combined set of code modifications obtains performance that is within a few percent of the original code for all compiler and architecture combinations on all tested problem sizes. The only exception to this (other than where we see performance improvement) is the GNU compiler on Haswell for one problem size. Our tests indicate that this may be due to immature support for that architecture in the GNU compiler; no such problem is seen on the Ivy Bridge system. Further, the original code performed poorly using the IBM compiler on Power8 and needed to be modified to obtain performant code. Therefore, the PSyKAl approach can be used with negligible performance loss and sometimes small performance gains compared to the original optimised code. We also find that there is no single best hand-optimised implementation of the code for all of the compilers tested.
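The separation described in the abstract can be pictured with a small sketch. It is illustrative only: the function names, the two toy kernels, and the use of Python are assumptions made here, not code or identifiers from the paper, whose benchmark is a finite-difference shallow-water code. The sketch shows point-wise science kernels, an optimisation layer that owns the loops in a vanilla and a loop-fused form, and an algorithm layer that calls either without changing the kernels.

```python
# Illustrative sketch only: every name below is hypothetical, not from the paper.

def kernel_scale(i, j, a, out, factor):
    """Science kernel: scale field `a` at a single grid point (i, j)."""
    out[i][j] = factor * a[i][j]


def kernel_add(i, j, a, b, out):
    """Science kernel: add fields `a` and `b` at a single grid point (i, j)."""
    out[i][j] = a[i][j] + b[i][j]


def _zeros(ny, nx):
    """Allocate an ny-by-nx field filled with zeros."""
    return [[0.0] * nx for _ in range(ny)]


def psy_unfused(a, b, factor):
    """Vanilla optimisation layer: one loop nest per kernel call."""
    ny, nx = len(a), len(a[0])
    scaled, total = _zeros(ny, nx), _zeros(ny, nx)
    for i in range(ny):
        for j in range(nx):
            kernel_scale(i, j, a, scaled, factor)
    for i in range(ny):
        for j in range(nx):
            kernel_add(i, j, scaled, b, total)
    return total


def psy_fused(a, b, factor):
    """Optimised layer: the two loop nests are fused into one.

    Fusion is safe here because kernel_add at (i, j) reads only the value
    that kernel_scale has just written at the same point (i, j).
    """
    ny, nx = len(a), len(a[0])
    scaled, total = _zeros(ny, nx), _zeros(ny, nx)
    for i in range(ny):
        for j in range(nx):
            kernel_scale(i, j, a, scaled, factor)
            kernel_add(i, j, scaled, b, total)
    return total


def algorithm(a, b, factor, psy=psy_unfused):
    """Algorithm layer: states what to compute, not how the loops are arranged."""
    return psy(a, b, factor)


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[0.5, 0.5], [0.5, 0.5]]
    print(algorithm(a, b, 2.0, psy=psy_unfused))
    print(algorithm(a, b, 2.0, psy=psy_fused))
```

Swapping `psy_unfused` for `psy_fused` changes only the middle layer, which mirrors the abstract's point that optimisations such as loop fusion can be applied without touching the science code. Performance in Python is of course not representative of the compiled codes the paper measures.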

Bibliographic details

Source
International Conference series on Parallel Computing | 2016 | xx, 850 pages | 12 pages in total
Authors
A. R. PORTER; R. W. FORD; M. ASHWORTH; G. D. RILEY; M. MODANI
Original format: PDF
Chinese Library Classification: TP338.6-532
Keywords
Performance; Code-generation; Finite-difference

Similar literature

1. Portable multi- and many-core performance for finite-difference or finite-element codes – application to the free-surface component of NEMO (NEMOLite2D 1.0) [J]. Porter Andrew R., Appleyard Jeremy, Ashworth Mike. Geoscientific Model Development. 2018, No. 8
2. Portable multi- and many-core performance for finite-difference or finite-element codes – application to the free-surface component of NEMO (NEMOLite2D 1.0) [J]. Porter Andrew R., Appleyard Jeremy, Ashworth Mike. Geoscientific Model Development Discussions. 2018, No. 8
3. Towards Compiler-Agnostic Performance in Finite-Difference Codes [C]. A. R. PORTER, R. W. FORD, M. ASHWORTH. International Conference series on Parallel Computing. 2016
4. A fourth-order symplectic finite-difference time-domain (FDTD) method for light scattering and a three-dimensional Monte Carlo code for radiative transfer in scattering systems [D]. Zhai, Pengwang. 2006
5. Performance of Nonlinear Finite-Difference Poisson-Boltzmann Solvers [O]. Qin Cai, Meng-Juei Hsieh, Jun Wang
6. Portable multi- and many-core performance for finite-difference or finite-element codes – application to the free-surface component of NEMO (NEMOLite2D 1.0) [O]. Andrew R. Porter, Jeremy Appleyard, Mike Ashworth. 2018
7. Interface of a Finite-Difference Time-Domain Electromagnetics Code with a Linear Transmission-Line/Circuit Code [R]. Turner, C. D., Riley, D. J., Bacon, L. D. 1989