Register pressure guided loop optimization.


Abstract

Digital signal processing (DSP) processors are processors designed to process digital signals, and they are used across a very broad range of fields. However, carelessly designed loop optimizations in an optimizing compiler for DSP processors do not always deliver a performance gain. Reasons include causing too much register pressure, or adding too much communication between register files to transfer values, which is called inter-cluster communication.

To control register pressure, predicting the register requirement before applying a loop optimization can effectively prevent performance degradation. In this dissertation, we focus on two essential loop optimizations: scalar replacement and unroll-and-jam. We present two low-cost register prediction methods that work on loops in a high-level representation and that account for the effects of other loop optimizations and general scalar optimizations before they are applied. For unroll-and-jam, we also describe a performance model that uses the prediction results to automatically select, from a given unroll space, the unroll vector that achieves the best run-time performance.

Our prediction algorithm for scalar replacement predicts the floating-point register pressure of a loop within 2 registers and the integer register pressure within 2.7 registers on average, with a time complexity of O(n^2) in practice, where n is the number of nodes in the data dependence graph used. This algorithm achieves performance similar to that of the best previous approach, which has O(n^3) complexity. For the unroll-and-jam prediction algorithm, our experiments show that it predicts the floating-point register pressure within 3 registers and the integer register pressure within 4 registers. With this algorithm, for 92% of the loops in our test suite, the performance model picks unroll vectors that achieve the best loop performance or performance close to the best. In addition, on the Polyhedron benchmark, our register pressure guided unroll-and-jam improves overall performance by about 2% over the model in the industry-leading Open64 optimizing backend, in both the 32-bit and 64-bit models for the x86 and x86-64 architectures.

For inter-cluster communication, this dissertation presents a fusion algorithm that considers the impact of unroll-and-jam, scalar replacement, and other optimizations on clustered VLIW architectures, in order to provide the best overall performance with the minimum additional inter-cluster communication. In the experiments, this fusion algorithm, applied together with unroll-and-jam and scalar replacement, speeds up the test loops by an average factor of 1.57 to 1.69 compared with the results of similar optimizations without fusion.

With the register pressure prediction algorithms and the demonstration of register pressure guided loop optimization, our research opens the door to completely eliminating, in the future, the performance degradation that loop optimizations suffer because of register pressure. Loop fusion that considers unroll-and-jam also helps a compiler achieve better performance on a clustered VLIW architecture with a partitioned register bank.
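As a rough illustration of the two transformations named in the abstract, the sketch below applies scalar replacement and unroll-and-jam by hand to a matrix-vector product. It is not taken from the dissertation: the function names, the choice of kernel, and the fixed unroll factor of 2 are all assumptions made for illustration, and Python is used only for readability (a compiler would perform these rewrites on its own intermediate representation). The point it tries to show is why register pressure grows: each extra unrolled outer iteration adds another scalar that must stay live across the inner loop.

```python
def matvec_baseline(a, x):
    """Original loop nest: y[i] is re-read and re-written on every inner iteration."""
    n, m = len(a), len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y


def matvec_scalar_replaced(a, x):
    """Scalar replacement: y[i] is kept in the scalar t during the inner loop.

    One value (t) now stays live across the whole inner loop.
    """
    n, m = len(a), len(x)
    y = [0.0] * n
    for i in range(n):
        t = 0.0
        for j in range(m):
            t += a[i][j] * x[j]
        y[i] = t
    return y


def matvec_unroll_and_jam_2(a, x):
    """Unroll-and-jam by 2 (hypothetical fixed factor) plus scalar replacement.

    The outer i loop is unrolled twice and the two inner-loop copies are fused.
    x[j] is loaded once and reused for both rows, but three scalars
    (t0, t1, xj) are now live in the inner loop instead of one.
    """
    n, m = len(a), len(x)
    y = [0.0] * n
    i = 0
    while i + 1 < n:
        t0 = 0.0
        t1 = 0.0
        for j in range(m):
            xj = x[j]
            t0 += a[i][j] * xj
            t1 += a[i + 1][j] * xj
        y[i] = t0
        y[i + 1] = t1
        i += 2
    if i < n:
        # Remainder iteration when n is odd.
        t = 0.0
        for j in range(m):
            t += a[i][j] * x[j]
        y[i] = t
    return y
```

All three functions compute the same result. Larger unroll factors would reuse each x[j] more but need more live scalars, which is the trade-off a register pressure prediction would have to weigh before choosing an unroll vector.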


Bibliographic record

Author: Ma, Yin
Author affiliation: Michigan Technological University
Degree-granting institution: Michigan Technological University
Discipline: Computer science
Degree: Ph.D.
Year: 2007
Pages: 164 p.
Total pages: 164
Original format: PDF
Language of text: English

