OpenCL Performance Prediction using Architecture-Independent Features

Abstract

OpenCL is an attractive programming model for heterogeneous high-performance computing systems, with wide support from hardware vendors and significant performance portability. To support efficient scheduling on HPC systems, it is necessary to make accurate performance predictions for OpenCL workloads on varied compute devices. This is challenging because diverse computation, communication and memory access characteristics result in varying performance between devices. The Architecture Independent Workload Characterization (AIWC) tool can be used to characterize OpenCL kernels according to a set of architecture-independent features. This work presents a methodology in which AIWC features are used to form a model capable of predicting accelerator execution times. We used this methodology to predict execution times for a set of 37 computational kernels running on 15 different devices representing a broad range of CPU, GPU and MIC architectures. The predictions are highly accurate, differing from the measured experimental run-times by an average of only 1.2%, which corresponds to actual execution time mispredictions of 9 μs to 1 sec depending on problem size. A previously unencountered code can be instrumented once and the AIWC metrics embedded in the kernel, to allow performance prediction across the full range of modelled devices. The results suggest that this methodology supports correct selection of the most appropriate device for a previously unencountered code, which is highly relevant to the HPC scheduling setting.
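
The workflow in the abstract (characterize a kernel once, predict its run-time on each modelled device, pick the fastest) can be sketched roughly as follows. This is a minimal illustration, not the authors' model: the abstract does not name the learning method, so a 1-nearest-neighbour lookup stands in for it, and the two-element feature vectors, device names and run-times are hypothetical placeholder values rather than AIWC output or results from the paper.

```python
import math

# Hypothetical training set (assumption): one feature vector per kernel and
# the measured run-time in seconds on each device. Values are placeholders.
TRAIN = [
    ((0.9, 0.1), {"cpu": 2.0, "gpu": 0.5}),
    ((0.2, 0.8), {"cpu": 1.0, "gpu": 3.0}),
    ((0.5, 0.5), {"cpu": 1.5, "gpu": 1.5}),
]


def predict_times(features, train=TRAIN):
    """Predict per-device run-times by copying the nearest known kernel.

    A stand-in for the paper's model: 1-nearest-neighbour in feature space.
    """
    nearest = min(train, key=lambda row: math.dist(row[0], features))
    return dict(nearest[1])


def select_device(features, train=TRAIN):
    """Return the device with the smallest predicted run-time."""
    times = predict_times(features, train)
    return min(times, key=times.get)


def mean_relative_error(predicted, measured):
    """Mean of |predicted - measured| / measured over paired run-times."""
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)
```

Under these assumptions, `select_device` takes the feature vector of a previously unseen kernel and returns a device name, and `mean_relative_error` is the kind of average relative difference the abstract reports as 1.2%.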

Bibliographic details

Source
International Conference on High Performance Computing & Simulation | 2018 | pp. 561-569 | 9 pages
Conference location: Orleans (FR)
Authors
Beau Johnston; Gregory Falzon; Josh Milthorpe;
Author affiliations

Research School of Computer Science, Australian National University, Canberra, ACT, Australia;

School of Science and Technology, University of New England, Armidale, NSW, Australia;

Original format: PDF
Keywords
Graphics processing units; Computational modeling; Memory management; Predictive models; Kernel; Performance evaluation; Parallel processing;
