Transformations de programme automatiques et source-à-source pour accélérateurs matériels de type GPU

Translation: Source-to-source automatic program transformations for GPU-like hardware accelerators

Abstract

Since the early 2000s, the raw performance of processors has stopped increasing exponentially. Modern graphics processing units (GPUs) are designed as arrays of hundreds or thousands of compute units. Their compute capacity quickly led them to be diverted from their original purpose and used as accelerators for general-purpose computation. However, programming a GPU efficiently for computations other than 3D rendering remains challenging.

The current jungle in the hardware ecosystem is mirrored in the software world, with more and more programming models, new languages, different APIs, and so on. But no one-size-fits-all solution has emerged.

This thesis proposes a compiler-based solution that partially addresses the three "P" properties: Performance, Portability, and Programmability. The goal is to automatically transform a sequential program into an equivalent program accelerated with a GPU. A prototype, Par4All, is implemented and validated with numerous experiments. Programmability and portability are enforced by definition; performance may not match what an expert programmer can obtain, but it has been measured as excellent for a wide range of kernels and applications.

A survey of GPU architectures and of trends in language and framework design is presented. Data movement between the host and the accelerator is managed without involving the developer. An algorithm is proposed to optimize communication by sending data to the GPU as early as possible and keeping it there as long as it is not required by the host. Loop transformation techniques for kernel code generation are involved, and even well-known ones have to be adapted to match specific GPU constraints. They are combined in a coherent and flexible way and dynamically scheduled within the compilation process of an interprocedural compiler. Some preliminary work is presented on extending the approach to multiple GPUs.
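The communication idea in the abstract (keep data on the GPU until the host actually needs it) can be illustrated with a small simulation. This is only a sketch under stated assumptions: it is not the algorithm from the thesis, which works at compile time, and it is not Par4All code. The names `CoherenceTracker`, `run_lazy`, and `naive_transfer_count` are hypothetical, no real GPU is involved, and the model covers only the "keep on the GPU" half of the idea, not the "send as early as possible" half.

```python
from dataclasses import dataclass, field


@dataclass
class CoherenceTracker:
    """Hypothetical model: records which side holds a valid copy of each array."""
    host_valid: dict = field(default_factory=dict)
    device_valid: dict = field(default_factory=dict)
    transfers: list = field(default_factory=list)

    def declare(self, name):
        # Assumption: a newly declared array is valid on the host only.
        self.host_valid[name] = True
        self.device_valid[name] = False

    def kernel(self, reads, writes):
        # Copy in only the inputs whose device copy is stale.
        for name in reads:
            if not self.device_valid[name]:
                self.transfers.append(("h2d", name))
                self.device_valid[name] = True
        # Outputs stay on the device; the host copy becomes stale.
        for name in writes:
            self.device_valid[name] = True
            self.host_valid[name] = False

    def host_read(self, name):
        # Copy back only when the host really needs the data.
        if not self.host_valid[name]:
            self.transfers.append(("d2h", name))
            self.host_valid[name] = True


def naive_transfer_count(kernels):
    """Baseline: every kernel copies all inputs in and all outputs out."""
    return sum(len(reads) + len(writes) for reads, writes in kernels)


def run_lazy(kernels, final_host_reads):
    """Run a kernel sequence through the tracker, then read results on the host."""
    tracker = CoherenceTracker()
    for reads, writes in kernels:
        for name in reads + writes:
            if name not in tracker.host_valid:
                tracker.declare(name)
    for reads, writes in kernels:
        tracker.kernel(reads, writes)
    for name in final_host_reads:
        tracker.host_read(name)
    return tracker


if __name__ == "__main__":
    # Three chained kernels: b = f(a); c = g(b); d = h(c, a)
    kernels = [(["a"], ["b"]), (["b"], ["c"]), (["c", "a"], ["d"])]
    lazy = run_lazy(kernels, final_host_reads=["d"])
    print("naive transfers:", naive_transfer_count(kernels))
    print("lazy transfers:", len(lazy.transfers), lazy.transfers)
```

In this toy chain the intermediate arrays `b` and `c` never leave the device, so only the first input and the final result cross the host/accelerator boundary.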

Bibliographic record

Author: Amini Mehdi
Year: 2012
Original format: PDF
Language of text: en
