Exploring multiple levels of performance modeling for heterogeneous systems.


Abstract

The current trend in High-Performance Computing (HPC) is to extract concurrency from clusters that include heterogeneous resources such as General Purpose Graphics Processing Units (GPGPUs) and Field-Programmable Gate Arrays (FPGAs). Although these heterogeneous systems can provide substantial performance for massively parallel applications, many of the available computing resources are often under-utilized due to inefficient application mapping, load balancing, and tuning. While several performance prediction models exist to efficiently tune applications, they often require significant computing architecture knowledge for reliable prediction. In addition, they do not address multiple levels of design space abstraction, and it is often difficult to choose a reliable prediction model for a given design. In this research, we develop a multi-level suite of performance prediction models for heterogeneous systems that primarily targets Synchronous Iterative Algorithms (SIAs). The modeling suite aims to produce accurate and straightforward application runtime prediction prior to the actual large-scale implementation. This suite addresses two levels of system abstraction: 1) low-level, where partial knowledge of the application implementation is present along with the system specifications, and 2) high-level, where the implementation details are minimal and only high-level computing system specifications are given. The performance prediction modeling suite is developed using our proposed Synchronous Iterative GPGPU Execution (SIGE) model for GPGPU clusters, motivated by the RC Amenability Test for Scalable Systems (RATSS) model for FPGA clusters. The low-level abstraction for GPGPU clusters consists of a regression-based performance prediction framework that statistically abstracts system architecture characteristics, enabling performance prediction without detailed architecture knowledge.
In this framework, the overall execution time of an application is predicted using regression models developed for host-device computations and network-level communications performed in the algorithm. We have used a family of Spiking Neural Network (SNN) models and an Anisotropic Diffusion Filter (ADF) algorithm as SIA case studies for verification of the regression-based framework and achieved over 90% prediction accuracy compared to the actual implementations for several GPGPU cluster configurations tested. The results establish the adequacy of the low-level abstraction model for advanced, fine-grained performance prediction and design space exploration (DSE). The high-level abstraction consists of the following two primary modeling approaches: qualitative modeling, which uses existing subjective-analytical models for computation and communication; and quantitative modeling, which predicts computation and communication performance by measuring hardware events associated with objective-analytical models using micro-benchmarks. The performance prediction provided by the high-level abstraction approaches, albeit coarse-grained, delivers useful insight into application performance on the chosen heterogeneous system. A blend of the two high-level modeling approaches, labeled as hybrid modeling, is explored for insightful preliminary performance prediction. The performance prediction models in the multi-level suite are verified and compared for their accuracy and ease of use, allowing developers to choose a model that best satisfies their design space abstraction. We also construct a roadmap that guides the user from optimal Application-to-Accelerator (A2A) mapping to fine-grained performance prediction, thereby providing a hierarchical approach to optimal application porting on the target heterogeneous system.
The end goal of this dissertation research is to offer the HPC community a thorough, non-architecture-specific performance prediction framework in the form of a hierarchical modeling suite that enables them to optimally utilize heterogeneous resources.
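
The regression-based idea in the abstract (separate regression models for computation and communication, combined into an overall runtime for a synchronous iterative algorithm) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the single-predictor linear form, the names `fit_linear`, `predict_runtime`, and `prediction_accuracy`, the definition of accuracy as one minus relative error, and the benchmark numbers, which are made up and not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """Least-squares line: time = intercept + slope * size."""
    intercept: float
    slope: float

    def predict(self, size: float) -> float:
        return self.intercept + self.slope * size


def fit_linear(sizes, times):
    """Ordinary least-squares fit of times against sizes (one predictor)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx
    return LinearModel(intercept=mean_y - slope * mean_x, slope=slope)


def predict_runtime(comp_model, comm_model, work_per_node, bytes_per_node, iterations):
    """Synchronous iterative pattern: each iteration computes, then communicates."""
    per_iteration = comp_model.predict(work_per_node) + comm_model.predict(bytes_per_node)
    return iterations * per_iteration


def prediction_accuracy(predicted, measured):
    """Accuracy as 1 - relative error (an assumed definition, not the author's)."""
    return 1.0 - abs(predicted - measured) / measured


# Hypothetical micro-benchmark samples: (problem size, seconds per iteration).
comp_model = fit_linear([1000, 2000, 4000, 8000], [0.012, 0.022, 0.042, 0.082])
# Hypothetical network samples: (message bytes, seconds per exchange).
comm_model = fit_linear([4096, 8192, 16384, 32768], [0.0015, 0.0025, 0.0045, 0.0085])

predicted = predict_runtime(
    comp_model, comm_model, work_per_node=6000, bytes_per_node=24576, iterations=100
)
```

Once a measured runtime is available for the same configuration, `prediction_accuracy(predicted, measured)` gives a figure comparable in spirit to the reported accuracy, under the assumed definition. A real framework of this kind would likely use more predictors and separate host and device terms; the sketch only shows how per-component regressions compose into a runtime estimate.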

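Bibliographic record

  • Author affiliation: Clemson University
  • Degree-granting institution: Clemson University
  • Discipline: Engineering Computer
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 232 p.
  • Total pages: 232
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification
  • Keywords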
