Pro++: A Profiling Framework for Primitive-Based GPU Programming

Nicola Bombieri; Federico Busato; Franco Fummi


Abstract

Parallelizing software applications through existing optimized primitives is a common trend that strikes a middle ground between the complexity of manual parallelization and the lower efficiency of directive-based programming models. Parallel primitive libraries allow software engineers to map any sequential code to a target many-core architecture by identifying the most computationally intensive code sections and mapping them onto one or more existing primitives. On the other hand, the spread of this primitive-based programming model and of different graphics processing unit (GPU) architectures has led to a large and growing number of third-party libraries, which often provide different implementations of the same primitive, each one optimized for a specific architecture. From the developer's point of view, this shifts the actual problem from parallelizing the software application to selecting, among the several implementations, the most efficient primitives for the target platform. This paper presents Pro++, a profiling framework for GPU primitives that measures the implementation quality of a given primitive by considering the characteristics of the target architecture. The framework collects the information provided by a standard GPU profiler and combines it into optimization criteria. The criteria evaluations are weighted to distinguish the impact of each optimization on the overall quality of the primitive implementation. This paper shows how the weights were tuned through the analysis of five of the most widespread existing primitive libraries and how the framework was eventually applied to improve the performance of two standard and widespread primitives.
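The weighting step described in the abstract can be pictured with a small sketch. This is not the Pro++ implementation: the abstract does not list the criteria, their formulas, or the tuned weights, so the criterion names (`occupancy`, `memory_coalescing`, `branch_uniformity`), the normalization to [0, 1], the weighted-average form, and every number below are assumptions chosen purely for illustration.

```python
from typing import Dict


def quality_score(criteria: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-criterion evaluations into one score (weighted average).

    Assumption: each criterion is already normalized to [0, 1], where 1 means
    the optimization is fully achieved. The abstract does not give the actual
    combination rule used by Pro++.
    """
    if set(criteria) != set(weights):
        raise ValueError("criteria and weights must use the same names")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name, value in criteria.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"criterion {name!r} is outside [0, 1]")
    return sum(weights[name] * criteria[name] for name in criteria) / total_weight


def best_implementation(candidates: Dict[str, Dict[str, float]],
                        weights: Dict[str, float]) -> str:
    """Return the name of the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: quality_score(candidates[name], weights))


if __name__ == "__main__":
    # Hypothetical weights and criterion values, not measurements from the paper.
    weights = {"occupancy": 0.2, "memory_coalescing": 0.5, "branch_uniformity": 0.3}
    candidates = {
        "impl_a": {"occupancy": 0.9, "memory_coalescing": 0.5, "branch_uniformity": 0.8},
        "impl_b": {"occupancy": 0.6, "memory_coalescing": 0.9, "branch_uniformity": 0.7},
    }
    for name, values in candidates.items():
        print(name, round(quality_score(values, weights), 3))
    print("selected:", best_implementation(candidates, weights))
```

In this sketch, raising the weight of one criterion makes implementations that do well on it rank higher, which is the role the abstract assigns to weight tuning across libraries and architectures.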


Bibliographic record

Source
Emerging Topics in Computing, IEEE Transactions on | 2018, Issue 3 | pp. 382-394 | 13 pages
Authors
Nicola Bombieri; Federico Busato; Franco Fummi
Author affiliations

Department of Computer Science, University of Verona, Verona, Italy;

Department of Computer Science, University of Verona, Verona, Italy;

Department of Computer Science, University of Verona, Verona, Italy;

Original format: PDF
Language: English
Keywords
Graphics processing units; Computational modeling; Instruction sets; Computer architecture; Optimization; Measurement; Libraries;


