Journal: Journal of Supercomputing

A methodology correlating code optimizations with data memory accesses, execution time and energy consumption

Abstract

The proliferation of data and electronic devices has put software with low execution time and low energy consumption in the spotlight. The key to optimizing software is the correct choice, order and parameters of the optimization transformations, which has remained an open problem in compilation research for decades, for several reasons. First, most of the transformations are interdependent, so addressing them separately is not effective. Second, it is very hard to couple the transformation parameters to the processor architecture (e.g., cache size) and to the algorithm characteristics (e.g., data reuse); compiler designers and researchers therefore either do not take them into account at all or do so only partly. Third, the exploration space, i.e., the set of all optimization configurations that have to be explored, is huge, so searching it is impractical. In this paper, these problems are addressed for data-dominant affine loop kernels. A novel methodology is presented that reduces the exploration space of six code optimizations by many orders of magnitude. The objective can be execution time (ET), energy consumption (E) or the number of L1, L2 and main memory accesses. The exploration space is reduced in two phases: first, by applying a novel register blocking algorithm and a novel loop tiling algorithm, and second, by computing the maximum and minimum ET/E values for each optimization set. The proposed methodology has been evaluated on both embedded and general-purpose CPUs and for seven well-known algorithms, achieving high gains in memory accesses, speedup and energy consumption (gain values from 1.17 up to 40) over the gcc compiler, hand-written optimized code and Polly. The exploration space from which the near-optimum parameters are selected is reduced by 17 to 30 orders of magnitude.
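
For readers unfamiliar with loop tiling, one of the transformations named in the abstract, the following is a minimal sketch applied to matrix multiplication. It is not the paper's tiling algorithm: the function names and the `tile` parameter are hypothetical, and the tile size is simply passed in rather than derived from cache size and data reuse as the abstract describes.

```python
def matmul_naive(a, b, n):
    """Reference triple loop: C = A * B for n x n lists of lists."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile):
    """Same computation with all three loops tiled by `tile`.

    `tile` is a free parameter in this sketch (an assumption); choosing it
    well is exactly the parameter-selection problem the abstract refers to.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # min() handles the case where n is not a multiple of tile
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        acc = c[i][j]  # keep the running sum in a local
                        for k in range(kk, min(kk + tile, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c
```

Both functions compute the same result; only the iteration order differs. Each tile size is one point in the exploration space, and combining tile sizes across several loops with other transformations is what makes that space so large.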