Journal: ACM Transactions on Mathematical Software

Automated Tiling of Unstructured Mesh Computations with Application to Seismological Modeling



Abstract

Sparse tiling is a technique to fuse loops that access common data, thus increasing data locality. Unlike traditional loop fusion or blocking, the loops may have different iteration spaces and access shared datasets through indirect memory accesses, such as A[map[i]], hence the name "sparse." One notable example of such loops arises in discontinuous-Galerkin finite element methods, because of the computation of numerical integrals over different domains (e.g., cells, facets). The major challenge with sparse tiling is implementation: not only is it cumbersome to understand and synthesize, but it is also onerous to maintain and generalize, as it requires a complete rewrite of the bulk of the numerical computation. In this article, we propose an approach to extend the applicability of sparse tiling based on raising the level of abstraction. Through a sequence of compiler passes, the mathematical specification of a problem is progressively lowered, and eventually sparse-tiled C for-loops are generated. Besides automation, we advance the state of the art by introducing a revisited, more efficient sparse tiling algorithm; support for distributed-memory parallelism; a range of fine-grained optimizations for increased runtime performance; implementation in a publicly available library, SLOPE; and an in-depth study of the performance impact in Seigen, a real-world elastic wave equation solver for seismological problems, which shows speed-ups of up to 1.28x on a platform consisting of 896 Intel Broadwell cores.
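
To make the idea of fusing loops with indirect accesses concrete, the following is a minimal sketch of an inspector/executor pair on a toy mesh. Everything in it is an assumption made for illustration: the mesh, the maps `c2v` and `f2v`, and the function names are hypothetical and are not taken from SLOPE or Seigen. The tiling rule is a simplified, sequential scheme and is not the algorithm proposed in the article. Loop 1 iterates over cells and increments vertex data through `c2v`; loop 2 iterates over facets and reads vertex data through `f2v`. A facet iteration is placed in the latest tile that writes any vertex it reads, so executing tiles in order preserves the result of the unfused loops.

```python
# Illustrative sketch only: mesh, maps, and names are hypothetical.

NV = 6                                          # number of vertices
c2v = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]  # loop 1 iterates over cells
f2v = [(0, 2), (1, 3), (2, 4), (3, 5)]          # loop 2 iterates over facets


def run_unfused(x):
    """Reference: run the two loops one after the other."""
    v = [0.0] * NV
    for c, verts in enumerate(c2v):
        for i in verts:
            v[i] += x[c]                        # indirect write: v[c2v[c][k]]
    return [v[a] + v[b] for a, b in f2v]        # indirect read:  v[f2v[f][k]]


def inspect(tile_size):
    """Inspector: assign every iteration of both loops to a tile."""
    # Seed tiling: contiguous blocks of loop-1 (cell) iterations.
    tile_c = [c // tile_size for c in range(len(c2v))]
    # For each vertex, record the last tile whose cells write to it.
    last = [0] * NV
    for c, verts in enumerate(c2v):
        for i in verts:
            last[i] = max(last[i], tile_c[c])
    # A facet may run only once every vertex it reads is final.
    tile_f = [max(last[a], last[b]) for a, b in f2v]
    return tile_c, tile_f, max(tile_c) + 1


def run_tiled(x, tile_size=2):
    """Executor: run tiles in order; each tile does its cells, then its facets."""
    tile_c, tile_f, ntiles = inspect(tile_size)
    v = [0.0] * NV
    y = [0.0] * len(f2v)
    for t in range(ntiles):
        # Scanning all iterations per tile keeps the sketch short; a real
        # implementation would precompute per-tile iteration lists.
        for c, verts in enumerate(c2v):
            if tile_c[c] == t:
                for i in verts:
                    v[i] += x[c]
        for f, (a, b) in enumerate(f2v):
            if tile_f[f] == t:
                y[f] = v[a] + v[b]
    return y


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(run_unfused(x))
    print(run_tiled(x))
```

Within a tile, the vertex values written by the cell loop are read by the facet loop shortly afterwards, which is where the locality benefit would come from on a mesh large enough that the vertex data does not fit in cache.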
