OpenCL Performance Prediction using Architecture-Independent Features

Abstract

OpenCL is an attractive programming model for heterogeneous high-performance computing systems, with wide support from hardware vendors and significant performance portability. To support efficient scheduling on HPC systems it is necessary to perform accurate performance predictions for OpenCL workloads on varied compute devices, which is challenging due to diverse computation, communication and memory access characteristics which result in varying performance between devices. The Architecture Independent Workload Characterization (AIWC) tool can be used to characterize OpenCL kernels according to a set of architecture-independent features. This work presents a methodology where AIWC features are used to form a model capable of predicting accelerator execution times. We used this methodology to predict execution times for a set of 37 computational kernels running on 15 different devices representing a broad range of CPU, GPU and MIC architectures. The predictions are highly accurate, differing from the measured experimental run-times by an average of only 1.2%, and correspond to actual execution time mispredictions of 9 μs to 1 sec according to problem size. A previously unencountered code can be instrumented once and the AIWC metrics embedded in the kernel, to allow performance prediction across the full range of modelled devices. The results suggest that this methodology supports correct selection of the most appropriate device for a previously unencountered code, which is highly relevant to the HPC scheduling setting.
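
The workflow the abstract describes (characterize a kernel once, predict its run-time on every modelled device, pick the fastest) can be illustrated with a minimal sketch. The abstract does not say which model is used, so the sketch swaps in an inverse-distance-weighted k-nearest-neighbour regressor purely for illustration. The kernel names, the two-element feature vectors, the device names and all timings are invented placeholders, not AIWC output or results from the paper.

```python
import math

# Hypothetical, made-up feature vectors (NOT real AIWC output): each kernel is
# characterized once, independently of any device.
TRAIN_FEATURES = {
    "kernel_a": (1.0, 0.2),
    "kernel_b": (4.0, 0.8),
    "kernel_c": (9.0, 0.5),
}

# Made-up measured run-times in milliseconds for each (device, kernel) pair.
TRAIN_TIMES_MS = {
    "cpu": {"kernel_a": 2.0, "kernel_b": 8.0, "kernel_c": 18.0},
    "gpu": {"kernel_a": 5.0, "kernel_b": 6.0, "kernel_c": 7.0},
}


def predict_time_ms(features, device, k=2):
    """Inverse-distance-weighted k-nearest-neighbour estimate of run-time.

    This is a stand-in model chosen for brevity, not the paper's model.
    """
    neighbours = sorted(
        (math.dist(features, f), TRAIN_TIMES_MS[device][name])
        for name, f in TRAIN_FEATURES.items()
    )[:k]
    # An exact feature match returns the measured time directly.
    if neighbours[0][0] == 0.0:
        return neighbours[0][1]
    weights = [1.0 / d for d, _ in neighbours]
    return sum(w * t for w, (_, t) in zip(weights, neighbours)) / sum(weights)


def select_device(features):
    """Return (device, predicted_ms) with the smallest predicted run-time."""
    predictions = {d: predict_time_ms(features, d) for d in TRAIN_TIMES_MS}
    best = min(predictions, key=predictions.get)
    return best, predictions[best]


def mean_abs_pct_error(predicted, measured):
    """Average of |predicted - measured| / measured, as a percentage."""
    total = sum(abs(p - m) / m for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)
```

Calling `select_device` with the features of a new kernel returns the device with the lowest predicted time, which is the scheduling decision the abstract has in mind. `mean_abs_pct_error` is one plausible reading of the "average of only 1.2%" figure; the abstract does not define the exact metric.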


Bibliographic details

Source
International Conference on High Performance Computing and Simulation | 2018 | p. 522 | 9 pages
Authors
Beau Johnston; Gregory Falzon; Josh Milthorpe
Original format: PDF
Chinese Library Classification: TP30-53
Keywords
Graphics processing units; Computational modeling; Memory management; Predictive models; Kernel; Performance evaluation; Parallel processing


