Conference: International Conference on High Performance Computing & Simulation

Roofline Scaling Trajectories: A Method for Parallel Application and Architectural Performance Analysis

Abstract

The end of Dennard scaling signaled a shift in HPC supercomputer architectures from systems built from single-core processors to systems built from multicore and, eventually, manycore processors. This transition substantially complicated performance optimization and analysis: new programming models were created, new scaling methodologies were deployed, and on-chip contention became a performance bottleneck. Existing distributed-memory performance models such as LogP and LogGP were unable to capture this contention. The Roofline model was created to address this contention and its interplay with locality. To date, however, the Roofline model has focused on full-node concurrency. In this paper, we extend the Roofline model to capture the effects of concurrency on data locality and on-chip contention. We demonstrate the value of this new technique by evaluating the NAS Parallel Benchmarks on both multicore and manycore architectures under both strong- and weak-scaling regimes. To quantify the interplay between programming model and locality, we evaluate scaling under both the OpenMP and flat MPI programming models.
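As a rough illustration of the idea in the abstract, the sketch below computes the standard Roofline bound (attainable performance is the lesser of peak compute and arithmetic intensity times memory bandwidth) and turns per-concurrency measurements into trajectory points of arithmetic intensity versus GFLOP/s. It is a minimal sketch and not the authors' implementation: the function names, the sample format, and all numbers are hypothetical and chosen only for illustration.

```python
def roofline_bound(ai, peak_gflops, bandwidth_gbs):
    """Attainable GFLOP/s under the basic Roofline model.

    ai            -- arithmetic intensity in FLOPs per byte
    peak_gflops   -- peak compute rate of the machine (assumed value)
    bandwidth_gbs -- peak memory bandwidth in GB/s (assumed value)
    """
    return min(peak_gflops, ai * bandwidth_gbs)


def trajectory(samples):
    """Convert measurements into Roofline trajectory points.

    samples -- iterable of (threads, flops, bytes_moved, seconds),
               one entry per concurrency level (hypothetical format)
    Returns a list of (threads, arithmetic_intensity, gflops).
    """
    points = []
    for threads, flops, bytes_moved, seconds in samples:
        ai = flops / bytes_moved        # FLOPs per byte at this concurrency
        gflops = flops / seconds / 1e9  # achieved performance
        points.append((threads, ai, gflops))
    return points


if __name__ == "__main__":
    # Made-up strong-scaling measurements: the same FLOP count at each
    # thread count, with data movement growing as threads contend.
    samples = [
        (1, 2e9, 4e9, 1.00),
        (2, 2e9, 5e9, 0.60),
        (4, 2e9, 8e9, 0.40),
    ]
    for threads, ai, gflops in trajectory(samples):
        bound = roofline_bound(ai, peak_gflops=1000.0, bandwidth_gbs=100.0)
        print(f"{threads} threads: AI={ai:.2f} FLOP/B, "
              f"{gflops:.2f} GFLOP/s (bound {bound:.1f})")
```

In this sketch, a trajectory whose arithmetic intensity drops as the thread count rises would suggest that added concurrency is reducing data locality, which is the kind of effect the abstract says the extended model is meant to expose.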