Journal: Computing

A methodology pruning the search space of six compiler transformations by addressing them together as one problem and by exploiting the hardware architecture details

Abstract

Today's compilers have a plethora of optimizations and transformations to choose from, and the choice, order, and parameters of these transformations have a large impact on performance. Choosing the correct order and parameters of optimizations is a long-standing problem in compilation research that remains unsolved: optimizing each sub-problem separately yields a different schedule/binary for each sub-problem, and these schedules cannot coexist, because refining one degrades the other. Researchers try to solve this problem with iterative compilation techniques, but the search space is so large that it cannot be searched even with modern supercomputers. Moreover, compiler transformations do not take hardware architecture details and data reuse into account efficiently. This paper presents a new iterative compilation methodology that reduces the search space of six compiler transformations by addressing the above problems; the search space shrinks by many orders of magnitude, so an efficient solution can now be found. The transformations are: loop tiling (including the number of tiling levels), loop unrolling, register allocation, scalar replacement, loop interchange, and data array layouts. The search space is reduced (a) by addressing these transformations together as one problem rather than separately, and (b) by taking into account the custom hardware architecture details (e.g., cache size and associativity) and algorithm characteristics (e.g., data reuse). The proposed methodology has been evaluated against iterative compilation and the gcc/icc compilers, on both embedded and general-purpose processors; it achieves significant performance gains at a compilation time that is many orders of magnitude lower.
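To make point (b) concrete, the following minimal sketch shows how a single hardware parameter can prune a tiling search space. It is an illustration only and not the methodology of the paper: the kernel (a matrix multiplication `C[i][j] += A[i][k] * B[k][j]`), the function names, the candidate grid, and the 32 KiB cache size are all assumptions, and the capacity check ignores associativity, which the abstract lists as a further constraint.

```python
from itertools import product


def candidate_tiles(n, step=8):
    """Tile edge lengths tried for a loop of n iterations (hypothetical grid:
    multiples of `step` that divide n evenly)."""
    return [t for t in range(step, n + 1, step) if n % t == 0]


def footprint_bytes(ti, tj, tk, elem_size=8):
    """Bytes touched by one tile of C[i][j] += A[i][k] * B[k][j]:
    a ti x tk tile of A, a tk x tj tile of B, and a ti x tj tile of C."""
    return elem_size * (ti * tk + tk * tj + ti * tj)


def prune(n, cache_bytes, elem_size=8, step=8):
    """Return the exhaustive (ti, tj, tk) space and the subset whose
    working set fits in a cache of `cache_bytes` (assumed capacity model)."""
    sizes = candidate_tiles(n, step)
    full = list(product(sizes, repeat=3))
    kept = [t for t in full if footprint_bytes(*t, elem_size) <= cache_bytes]
    return full, kept


# Assumed example: 256 x 256 matrices of doubles, 32 KiB data cache.
full, kept = prune(n=256, cache_bytes=32 * 1024)
print(f"exhaustive candidates: {len(full)}, after cache constraint: {len(kept)}")
```

Only the surviving candidates would need to be compiled and timed. Handling the tile sizes of all three loops jointly, rather than tuning each loop in isolation, is a small-scale analogue of point (a): the cache constraint couples `ti`, `tj`, and `tk`, so a choice that looks good for one loop alone may be infeasible once the others are fixed.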
