Towards a Multi-Level Cache Performance Model for 3D Stencil Computation

Ràul de la Cruz; Mauricio Araya-Polo


Abstract

Optimizing stencil computations is crucial because they are the core (and the most computationally demanding segment) of many scientific computing applications, so speeding them up reduces overall execution time. This is not a simple task; it is lengthy and tedious. It is lengthy because of the large number of combinations of stencil optimizations to test, which might consume days of computing time, and tedious because of the many slightly different versions of the code that must be implemented. Alternatively, models that predict performance can be built without any actual stencil execution, thus easing the cumbersome optimization task. Previous works have proposed cache-miss and execution-time models for specific stencil optimizations. However, most of them were designed for 2D datasets or for stencil sizes that only suit low-order numerical schemes. We propose a flexible and accurate model that covers a wide range of stencil sizes, up to high-order schemes, and captures the behavior of 3D stencil computations using platform parameters. The model has been tested on a group of representative hardware architectures using realistic dataset sizes. Our model successfully predicts stencil execution times and cache misses. Prediction accuracy depends on the platform, however; for instance, on x86 architectures prediction errors range between 1% and 20%. The model is therefore reliable and can help speed up the stencil optimization process. Moreover, other stencil optimization techniques can be added to the model, essentially providing a framework that covers most of the state of the art.


Bibliographic information

Source
Procedia Computer Science | 2011, Issue 1 | 10 pages
Authors
Ràul de la Cruz; Mauricio Araya-Polo
Original format: PDF
CLC classification: Computing technology, computer technology
Keywords
stencil computation; HPC; code optimization; homogeneous multi-core; performance model


