Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies

Sylvain Girbal; Nicolas Vasilache; Cedric Bastoul; Albert Cohen; David Parello; Marc Sigler; Olivier Temam


Abstract

Modern compilers are responsible for translating the idealistic operational semantics of the source program into a form that makes efficient use of a highly complex heterogeneous machine. Since optimization problems are associated with huge and unstructured search spaces, this combinatorial task is poorly achieved in general, resulting in weak scalability and disappointing sustained performance. We address this challenge by working on the program representation itself, using a semi-automatic optimization approach to demonstrate that current compilers often suffer from unnecessary constraints and intricacies that can be avoided in a semantically richer transformation framework. Technically, the purpose of this paper is threefold: (1) to show that syntactic code representations close to the operational semantics lead to rigid phase ordering and cumbersome expression of architecture-aware loop transformations; (2) to illustrate how complex transformation sequences may be needed to achieve significant performance benefits; and (3) to facilitate the automatic search for program transformation sequences, improving on classical polyhedral representations to better support operations research strategies in a simpler, structured search space. The proposed framework relies on a unified polyhedral representation of loops and statements, using normalization rules to allow flexible and expressive transformation sequencing. This representation makes it possible to extend the scalability of polyhedral dependence analysis and to delay the (automatic) legality checks until the end of a transformation sequence. Our work builds on algorithmic advances in polyhedral code generation and has been implemented in a modern research compiler.
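
The idea of composing a transformation sequence first and checking legality only at the end can be illustrated with a deliberately simplified sketch. It is not the paper's framework: it assumes a two-deep loop nest, unimodular transformations written as 2x2 integer matrices, and uniform dependences given as distance vectors, whereas the paper works with a richer polyhedral representation. All names (`compose`, `is_legal`, `INTERCHANGE`, `REVERSE_OUTER`, `DEPS`) are hypothetical and chosen for illustration only.

```python
from functools import reduce

def matmul(a, b):
    """Multiply two small integer matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply(t, d):
    """Apply transformation matrix t to a dependence distance vector d."""
    return tuple(sum(row[k] * d[k] for k in range(len(d))) for row in t)

def lex_positive(v):
    """True if the first nonzero component of v is positive."""
    for x in v:
        if x > 0:
            return True
        if x < 0:
            return False
    return False  # the all-zero vector is not lexicographically positive

def compose(steps):
    """Fold a sequence of transformations, applied in list order, into one matrix."""
    n = len(steps[0])
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    return reduce(lambda acc, t: matmul(t, acc), steps, identity)

def is_legal(t, deps):
    """Assumed legality test: every transformed distance stays lexicographically positive."""
    return all(lex_positive(apply(t, d)) for d in deps)

# Hypothetical example data: two transformations and two dependence distances.
INTERCHANGE = [[0, 1], [1, 0]]      # swap loops i and j
REVERSE_OUTER = [[-1, 0], [0, 1]]   # reverse the (new) outer loop
DEPS = [(1, -1), (1, 0)]

composite = compose([INTERCHANGE, REVERSE_OUTER])

print(is_legal(INTERCHANGE, DEPS))  # intermediate step checked on its own
print(is_legal(composite, DEPS))    # whole sequence checked once, at the end
```

In this toy setting the interchange alone violates the dependence `(1, -1)`, yet the full sequence is legal, so a framework that rejects each step eagerly would never reach the final, valid schedule. Checking the composed matrix once is the simplified analogue of deferring legality checks to the end of the sequence.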


Publication details

Source
International Journal of Parallel Programming | 2006, Issue 3 | pp. 261-317 | 57 pages
Authors
Sylvain Girbal; Nicolas Vasilache; Cedric Bastoul; Albert Cohen; David Parello; Marc Sigler; Olivier Temam
Author affiliation

ALCHEMY Group, INRIA Futurs and LRI, Paris-Sud 11 University, France

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
compiler optimization; semi-automatic program transformation; polyhedral model; automatic parallelization


