Conference: ACM/IEEE conference on Supercomputing

Compiling stencils in high performance Fortran

Abstract

For many Fortran90 and HPF programs that perform dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. Stencil computations are commonly used in solving partial differential equations, image processing, and geometric modeling. Handling such stencils efficiently is critical for achieving high performance on distributed-memory machines. Compiling stencils into efficient code is considered so important that some companies have built special-purpose compilers for them, and others have added stencil recognizers to existing compilers.

In this paper we present a general compilation strategy for stencils written using Fortran90 array constructs. Our strategy can optimize single-statement or multi-statement stencils and applies equally well to stencils specified with shift intrinsics and to stencils specified with array syntax. The strategy eliminates the need for pattern-recognition algorithms by orchestrating a set of optimizations that address the overhead of both intraprocessor and interprocessor data movement that results from the translation of Fortran90 array constructs. Our experimental results show that code produced by this strategy beats or matches the best code produced by the special-purpose compilers or pattern-recognition schemes known to us. In addition, our strategy produces highly optimized code in situations where the others fail, yielding performance improvements of several orders of magnitude, and thus provides a stencil compilation strategy that is more robust than its predecessors.
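As a rough illustration of the two ways of specifying a stencil that the abstract mentions, the sketch below writes the same five-point sum twice. It is plain Python rather than Fortran90, and it is not taken from the paper. The helper `cshift` is a hypothetical stand-in modelled on the semantics of the Fortran90 `CSHIFT` intrinsic, and the function names and the 4×4 test grid are assumptions chosen for the example. The shift-based version builds four whole-array temporaries before adding them, which hints at the kind of data-movement overhead a naive translation of array constructs can incur.

```python
def cshift(a, shift, dim):
    """Circular shift of a 2-D list, modelled on Fortran 90 CSHIFT:
    along `dim`, result[i] = a[(i + shift) mod n]; dim is 0 (rows) or 1 (columns)."""
    n_rows, n_cols = len(a), len(a[0])
    if dim == 0:
        return [a[(i + shift) % n_rows][:] for i in range(n_rows)]
    return [[row[(j + shift) % n_cols] for j in range(n_cols)] for row in a]


def stencil_with_shifts(src):
    """Five-point sum written with whole-array shifts (shift-intrinsic style).
    Each shift materializes a full temporary copy of the array."""
    north = cshift(src, -1, 0)
    south = cshift(src, +1, 0)
    west = cshift(src, -1, 1)
    east = cshift(src, +1, 1)
    n_rows, n_cols = len(src), len(src[0])
    return [[src[i][j] + north[i][j] + south[i][j] + west[i][j] + east[i][j]
             for j in range(n_cols)] for i in range(n_rows)]


def stencil_with_sections(src):
    """Same five-point sum on interior points only, via direct neighbor indexing
    (loosely analogous to array-section syntax)."""
    n_rows, n_cols = len(src), len(src[0])
    return [[src[i][j] + src[i - 1][j] + src[i + 1][j] + src[i][j - 1] + src[i][j + 1]
             for j in range(1, n_cols - 1)] for i in range(1, n_rows - 1)]


# Hypothetical 4x4 test grid holding the values 0.0 .. 15.0.
grid = [[float(4 * i + j) for j in range(4)] for i in range(4)]
shifted = stencil_with_shifts(grid)
sectioned = stencil_with_sections(grid)

# The two formulations agree on interior points, where no wraparound occurs.
interior_of_shifted = [row[1:-1] for row in shifted[1:-1]]
print(interior_of_shifted == sectioned)
```

The two versions differ only at the boundary, where the circular shift wraps around and the indexed version simply omits the edge points.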