Journal: Concurrency, practice and experience

Autotuning of configuration for program execution in GPUs

Abstract

Graphics Processing Units (GPUs) are used as accelerators to improve performance when executing highly data-parallel applications. GPUs are characterized by a number of Streaming Multiprocessors (SMs) and a large number of cores within each SM. In addition, GPUs contain a hierarchy of memories with different latencies and sizes. Program execution on GPUs therefore depends on a number of parameter values, both at compile time and at runtime. To obtain optimal performance from these GPU resources, a large parameter space must be explored, which leads to a number of unproductive program executions. To alleviate this difficulty, machine learning-based autotuning systems have been proposed that predict the right configuration from a limited set of compile-time parameters. In this paper, we propose a two-stage machine learning-based autotuning framework using an expanded set of attributes. Important parameters such as block size, occupancy, eligible warps, and execution time are predicted. The mean relative error in predicting the different parameters ranges from 6.5% to 16%. Dimensionality reduction on the feature set reduces the number of features by up to 50% while further improving prediction accuracy.
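
As an illustration of the kind of two-stage prediction the abstract describes, the sketch below is not the authors' implementation; all feature names, model choices, and data are hypothetical. It uses PCA for dimensionality reduction and a random-forest regressor to predict a launch configuration (block size) from compile-time attributes, then reuses that prediction to estimate execution time:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Hypothetical compile-time kernel attributes (registers per thread,
# shared memory per block, instruction-mix ratios, ...).
X_compile = rng.random((200, 12))
y_block_size = rng.choice([64, 128, 256, 512], size=200)  # stage-1 target
y_exec_time = 0.5 + rng.random(200)                       # stage-2 target (ms)

# Stage 1: dimensionality reduction + regression to predict block size.
stage1 = make_pipeline(PCA(n_components=6),
                       RandomForestRegressor(n_estimators=100, random_state=0))
stage1.fit(X_compile, y_block_size)
pred_block = stage1.predict(X_compile)

# Stage 2: augment the features with the predicted configuration and
# predict execution time (occupancy and eligible warps would be analogous).
X_stage2 = np.column_stack([X_compile, pred_block])
stage2 = make_pipeline(PCA(n_components=6),
                       RandomForestRegressor(n_estimators=100, random_state=0))
stage2.fit(X_stage2, y_exec_time)

# Mean relative error, the accuracy metric quoted in the abstract.
mre = np.mean(np.abs(stage2.predict(X_stage2) - y_exec_time) / y_exec_time)
print(f"mean relative error: {mre:.3f}")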