Procedia Computer Science

Towards a Multi-Level Cache Performance Model for 3D Stencil Computation

Abstract

Optimizing stencil computations is crucial because they are the core (and the most computationally demanding segment) of many scientific computing applications, so faster stencils reduce overall execution time. This is not a simple task; it is both lengthy and tedious. It is lengthy because of the large number of combinations of stencil optimizations to test, which might consume days of computing time, and it is tedious because of the many slightly different versions of the code that must be implemented. Alternatively, models that predict performance can be built without any actual stencil execution, thus easing the cumbersome optimization task. Previous works have proposed cache-miss and execution-time models for specific stencil optimizations. However, most of them have been designed for 2D datasets or for stencil sizes that only suit low-order numerical schemes. We propose a flexible and accurate model for a wide range of stencil sizes, up to high-order schemes, that captures the behavior of 3D stencil computations using platform parameters. The model has been tested on a group of representative hardware architectures using realistic dataset sizes. Our model successfully predicts stencil execution times and cache misses. Prediction accuracy depends on the platform; for instance, on x86 architectures prediction errors range between 1% and 20%. The model is therefore reliable and can help to speed up the stencil computation optimization process. To that end, other stencil optimization techniques can be added to this model, thus essentially providing a framework that covers most of the state of the art.
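For readers unfamiliar with the term, the sketch below shows what a 3D stencil of variable size looks like. It is an illustration only, not the kernel or the performance model from the paper: the star shape, the function names, and the `weights` layout are all assumptions made here. Higher-order schemes generally use a larger radius, so each updated point reads more neighbors, and in a contiguous array the neighbors along z lie whole planes apart, which is what makes cache behavior hard to predict.

```python
def star_stencil_points(radius):
    """Number of grid points read per update by a 3D star stencil."""
    # One center point plus `radius` neighbors in each of the six directions.
    return 6 * radius + 1


def apply_star_stencil(grid, radius, weights):
    """Apply one sweep of a 3D star stencil to the interior of `grid`.

    grid    -- nested lists indexed as grid[z][y][x] (hypothetical layout)
    weights -- weights[0] is the center coefficient; weights[r] is shared
               by the six neighbors at distance r along the axes
    """
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    # Copy the input so that cells closer than `radius` to a face stay unchanged.
    out = [[row[:] for row in plane] for plane in grid]
    for z in range(radius, nz - radius):
        for y in range(radius, ny - radius):
            for x in range(radius, nx - radius):
                acc = weights[0] * grid[z][y][x]
                for r in range(1, radius + 1):
                    acc += weights[r] * (
                        grid[z][y][x - r] + grid[z][y][x + r]    # neighbors along x
                        + grid[z][y - r][x] + grid[z][y + r][x]  # neighbors along y
                        + grid[z - r][y][x] + grid[z + r][y][x]  # neighbors along z
                    )
                out[z][y][x] = acc
    return out
```

With `radius = 1` this is the familiar 7-point stencil; `radius = 4` already reads 25 points per update, which gives a sense of why models limited to low-order schemes do not carry over directly.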
