Roofline Scaling Trajectories: A Method for Parallel Application and Architectural Performance Analysis

Abstract

The end of Dennard scaling signaled a shift in HPC supercomputer architectures from systems built from single-core processor architectures to systems built from multicore and eventually manycore architectures. This transition substantially complicated performance optimization and analysis as new programming models were created, new scaling methodologies deployed, and on-chip contention became a bottleneck to performance. Existing distributed memory performance models like LogP and LogGP were unable to capture this contention. The Roofline model was created to address this contention and its interplay with locality. However, to date, the Roofline model has focused on full-node concurrency. In this paper, we extend the Roofline model to capture the effects of concurrency on data locality and on-chip contention. We demonstrate the value of this new technique by evaluating the NAS parallel benchmarks on both multicore and manycore architectures under both strong- and weak-scaling regimes. In order to quantify the interplay between programming model and locality, we evaluate scaling under both the OpenMP and flat MPI programming models.
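
For readers unfamiliar with the model, the standard Roofline bound caps attainable performance at the smaller of peak compute rate and arithmetic intensity times peak memory bandwidth. The sketch below is a minimal illustration of that bound and of how per-concurrency measurements could be turned into a trajectory of (arithmetic intensity, GFLOP/s) points. It is not the authors' implementation: the names `Machine`, `roofline_bound`, and `trajectory` are hypothetical, and every number is made up for illustration rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    peak_gflops: float    # peak compute rate in GFLOP/s (illustrative value)
    peak_gbytes_s: float  # peak memory bandwidth in GB/s (illustrative value)


def roofline_bound(machine: Machine, intensity: float) -> float:
    """Attainable GFLOP/s at a given arithmetic intensity (FLOP/byte)."""
    return min(machine.peak_gflops, intensity * machine.peak_gbytes_s)


def trajectory(samples):
    """Convert (threads, flops, bytes_moved, seconds) samples into
    (threads, arithmetic intensity, GFLOP/s) points, ordered by thread count."""
    points = []
    for threads, flops, bytes_moved, seconds in sorted(samples):
        intensity = flops / bytes_moved   # FLOP per byte of memory traffic
        gflops = flops / seconds / 1e9    # achieved compute rate
        points.append((threads, intensity, gflops))
    return points


if __name__ == "__main__":
    # Hypothetical machine and hypothetical strong-scaling measurements.
    machine = Machine(peak_gflops=1000.0, peak_gbytes_s=100.0)
    samples = [
        (1, 8e9, 4e9, 1.0),
        (2, 8e9, 4e9, 0.5),
        (4, 8e9, 8e9, 0.4),  # more data movement at higher concurrency
    ]
    for threads, ai, gflops in trajectory(samples):
        bound = roofline_bound(machine, ai)
        print(f"{threads} threads: AI={ai:.2f} FLOP/byte, "
              f"{gflops:.1f} GFLOP/s (bound {bound:.1f})")
```

In this made-up data the arithmetic intensity drops at four threads, so the point moves left on the Roofline plot as well as up. A shift of that kind is what a trajectory across concurrency levels would expose and a single full-node measurement would not.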

Bibliographic details

Source
International Conference on High Performance Computing and Simulation | 2018 | 522p | 9 pages
Authors
Khaled Ibrahim; Samuel Williams; Leonid Oliker
Original format: PDF
Chinese Library Classification: TP30-53
Keywords
Trajectory; Concurrent computing; Computational modeling; Multicore processing; Programming; Measurement

Similar literature

1. Roofline analysis with Cray performance analysis tools (CrayPat) and roofline-based performance projections for a future architecture [J]. JaeHyuk Kwack, Galen Arnold, Celso Mendes, Concurrency, practice and experience. 2019, no. 16
2. Performance Analysis of Homogeneous On-Chip Large-Scale Parallel Computing Architectures for Data-Parallel Applications [J]. Xiaowen Chen, Zhonghai Lu, Axel Jantsch, Journal of Electrical and Computer Engineering. 2015
3. Roofline Scaling Trajectories: A Method for Parallel Application and Architectural Performance Analysis [C]. Khaled Ibrahim, Samuel Williams, Leonid Oliker, International Conference on High Performance Computing and Simulation. 2018
4. Performance analysis and modeling of parallel applications in the context of architectural rooflines [D]. Shaila, Nashid. 2016
5. Performance of parallel FDTD method for shared- and distributed-memory architectures: Application to bioelectromagnetics [O]. Miguel Ruiz-Cabello N., Maksims Abaļenkovs, Luis M. Diaz Angulo. 2020
6. Performance Analysis of Homogeneous On-Chip Large-Scale Parallel Computing Architectures for Data-Parallel Applications [O]. Xiaowen Chen, Zhonghai Lu, Axel Jantsch. 2015
7. Performance and scalability analysis of teraflop-scale parallel architectures using multidimensional wavefront applications [R]. Hoisie, A., Lubeck, O., Wasserman, H. 1998
