首页> 外文期刊>ACM Transactions on Architecture and Code Optimization >Preallocation Instruction Scheduling with Register Pressure Minimization Using a Combinatorial Optimization Approach
【24h】

Preallocation Instruction Scheduling with Register Pressure Minimization Using a Combinatorial Optimization Approach

机译:使用组合优化方法的带寄存器压力最小化的预分配指令调度

获取原文
获取原文并翻译 | 示例
           

摘要

Balancing Instruction-Level Parallelism (ILP) and register pressure during preallocation instruction scheduling is a fundamentally important problem in code generation and optimization. The problem is known to be NP-complete. Many heuristic techniques have been proposed to solve this problem. However, due to the inherently conflicting requirements of maximizing ILP and minimizing register pressure, heuristic techniques may produce poor schedules in many cases. If such cases occur in hot code, significant performance degradation may result. A few combinatorial optimization approaches have also been proposed, but none of them has been shown to solve large real-world instances within reasonable time. This article presents the first combinatorial algorithm that is efficient enough to optimally solve large instances of this problem (basic blocks with hundreds of instructions) within a few seconds per instance. The proposed algorithm uses branch-and-bound enumeration with a number of powerful pruning techniques to efficiently search the solution space. The search is based on a cost function that incorporates schedule length and register pressure. An implementation of the proposed scheduling algorithm has been integrated into the LLVM Compiler and evaluated using SPEC CPU 2006. On x86-64, with a time limit of 10ms per instruction, it optimally schedules 79% of the hot basic blocks in FP2006. Another 19% of the blocks are not optimally scheduled but are improved in cost relative to LLVM's heuristic. This improves the execution time of some benchmarks by up to 21%, with a geometric-mean improvement of 2.4% across the entire benchmark suite. With the use of precise latency information, the geometric-mean improvement is increased to 2.8%.
机译:在预分配指令调度期间平衡指令级并行性(ILP)和寄存器压力是代码生成和优化中的一个根本性重要问题。已知该问题是NP完全的。已经提出了许多启发式技术来解决该问题。但是,由于最大化ILP和最小化寄存器压力的内在矛盾要求,在许多情况下,启发式技术可能会产生不良的计划。如果此类情况出现在热代码中,则可能会导致性能显着下降。还提出了一些组合优化方法,但没有一个方法能在合理的时间内解决大型现实实例。本文介绍了第一个组合算法,该算法足够有效,可以在每个实例的几秒钟内以最佳方式解决此问题的大型实例(具有数百条指令的基本块)。所提出的算法使用分支定界枚举和许多强大的修剪技术来有效地搜索解空间。该搜索基于包含时间表长度和注册压力的成本函数。所建议的调度算法的实现已集成到LLVM编译器中,并使用SPEC CPU 2006进行了评估。在x86-64上,每条指令的时间限制为10ms,它可以最佳地调度FP2006中79%的热基本块。与LLVM的启发式方法相比,另外19%的块未进行最佳调度,但成本有所改善。这将某些基准的执行时间缩短了多达21%,在整个基准套件中的几何平均改进了2.4%。通过使用精确的等待时间信息,几何平均改进提高到2.8%。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号