Faithful performance prediction of a dynamic task-based runtime system for heterogeneous multi-core architectures

Luka Stanisic; Samuel Thibault; Arnaud Legrand; Brice Videau; Jean-Francois Méhaut

Abstract

Multi-core architectures comprising several graphics processing units (GPUs) have become mainstream in the field of high-performance computing. However, obtaining the maximum performance from such heterogeneous machines is challenging, as it requires carefully offloading computations and managing data movements between the different processing units. The most promising and successful approaches so far build on task-based runtimes that abstract the machine and rely on opportunistic scheduling algorithms. As a consequence, the problem shifts to choosing the task granularity and task graph structure and to optimizing the scheduling strategies. Trying different combinations of these alternatives is itself a challenge. Indeed, obtaining accurate measurements requires reserving the target system for the whole duration of the experiments. Furthermore, observations are limited to the few systems at hand and may be difficult to generalize. In this article, we show how we crafted a coarse-grain hybrid simulation/emulation of StarPU, a dynamic runtime for hybrid architectures, over SimGrid, a versatile simulator of distributed systems. This approach yields performance predictions of classical dense linear algebra kernels that are accurate within a few percent and are obtained in a matter of seconds, which allows both runtime and application designers to quickly decide which optimization to enable or whether it is worth investing in higher-end graphics processing units. Additionally, it makes it possible to conduct robust and extensive scheduling studies in a controlled environment whose characteristics are very close to those of real platforms while having reproducible behavior.
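The idea of coarse-grain simulation of a task-based runtime can be illustrated with a toy sketch. The code below is not StarPU or SimGrid code and does not reflect the authors' implementation: it is a minimal discrete-event loop, written under the assumption that each task's execution can be replaced by advancing a simulated clock by a per-kernel duration estimate. The kernel names, the `DURATION_MS` table and its numbers, the `simulate` function, and the greedy scheduling policy are all hypothetical and chosen only for illustration; data transfers are ignored entirely.

```python
import heapq

# Hypothetical mean kernel durations in milliseconds, per worker type.
# These numbers are made up for illustration, not measured values.
DURATION_MS = {
    ("potrf", "cpu"): 10.0, ("potrf", "gpu"): 8.0,
    ("trsm", "cpu"): 12.0, ("trsm", "gpu"): 3.0,
    ("gemm", "cpu"): 20.0, ("gemm", "gpu"): 2.0,
}


def simulate(tasks, deps, workers):
    """Simulate a task graph without running any kernel.

    tasks:   dict mapping task id -> kernel name
    deps:    dict mapping task id -> set of predecessor task ids
    workers: non-empty list of worker types, e.g. ["cpu", "gpu"]
    Returns (makespan_ms, schedule), where schedule maps
    task id -> (worker index, start time, finish time).
    """
    remaining = {t: len(deps.get(t, ())) for t in tasks}
    successors = {t: [] for t in tasks}
    for task, preds in deps.items():
        for pred in preds:
            successors[pred].append(task)

    ready = sorted(t for t, n in remaining.items() if n == 0)
    idle = list(range(len(workers)))
    running = []  # min-heap of (finish time, task id, worker index)
    now = 0.0
    schedule = {}

    while ready or running:
        # Greedy dynamic scheduling: hand each ready task to the idle
        # worker whose duration estimate for that kernel is smallest.
        while ready and idle:
            task = ready.pop(0)
            w = min(idle, key=lambda i: DURATION_MS[(tasks[task], workers[i])])
            idle.remove(w)
            finish = now + DURATION_MS[(tasks[task], workers[w])]
            heapq.heappush(running, (finish, task, w))
            schedule[task] = (w, now, finish)

        # Instead of executing anything, jump the simulated clock to the
        # next task completion and release that task's successors.
        now, done, w = heapq.heappop(running)
        idle.append(w)
        for succ in successors[done]:
            remaining[succ] -= 1
            if remaining[succ] == 0:
                ready.append(succ)

    return now, schedule


# A four-task graph: A precedes B and C, which both precede D.
tasks = {"A": "potrf", "B": "trsm", "C": "trsm", "D": "gemm"}
deps = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

makespan, schedule = simulate(tasks, deps, ["cpu", "gpu"])
print(makespan)  # 22.0
```

In this toy run the greedy policy places task C on the CPU as soon as it is ready, although waiting for the GPU would have finished it sooner. Being able to replay such a decision cheaply and reproducibly, with a different policy or a different duration table, is the kind of study the abstract describes, here reduced to a few lines.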

Bibliographic record

Source: Concurrency and Computation: Practice and Experience, 2015, issue 16, pp. 4075–4090

Author affiliations:

CNRS/Inria, University of Grenoble, Grenoble, France;

Inria, University of Bordeaux, Talence, France;

CNRS/Inria, University of Grenoble, Grenoble, France;

CNRS/Inria, University of Grenoble, Grenoble, France;

CNRS/Inria, University of Grenoble, Grenoble, France;

Original format: PDF
Language: English
Keywords: HPC; runtimes; simulations
