Journal: ACM Transactions on Embedded Computing Systems

A Constraint Programming Approach for Integrated Spatial and Temporal Scheduling for Clustered Architectures


           

Abstract

Many embedded processors use clustering to scale up instruction-level parallelism in a cost-effective manner. In a clustered architecture, the registers and functional units are partitioned into smaller units and clusters communicate through register-to-register copy operations. Texas Instruments, for example, has a series of architectures for embedded processors which are clustered. Such an architecture places a heavier burden on the compiler, which must now assign instructions to clusters (spatial scheduling), assign instructions to cycles (temporal scheduling), and schedule copy operations to move data between clusters. We consider instruction scheduling of local blocks of code on clustered architectures to improve performance. Scheduling for space and time is known to be a hard problem. Previous work has proposed greedy approaches based on list scheduling to simultaneously perform spatial and temporal scheduling and phased approaches based on first partitioning a block of code to do spatial assignment and then performing temporal scheduling. Greedy approaches risk making mistakes that are then costly to recover from, and partitioning approaches suffer from the well-known phase ordering problem. In this article, we present a constraint programming approach for scheduling instructions on clustered architectures. We employ a problem decomposition technique that solves spatial and temporal scheduling in an integrated manner. We analyze the effect of different hardware parameters - such as the number of clusters, issue-width, and intercluster communication cost - on application performance. We found that our approach was able to achieve an improvement of up to 26%, on average, over a state-of-the-art technique on superblocks from SPEC 2000 benchmarks.
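To make the problem statement concrete, the following is a minimal sketch of joint spatial and temporal scheduling on a toy basic block. It is not the article's constraint programming model or its decomposition technique: it simply enumerates every (cluster, start cycle) assignment by brute force. The function names, the dependency encoding, and the simplification that an intercluster copy is modeled as extra latency on a dependency edge (rather than as a separately scheduled copy operation) are all assumptions made for illustration.

```python
from itertools import product


def is_feasible(deps, latency, cluster, start, issue_width, copy_cost):
    """Check one joint spatial (cluster) and temporal (start cycle) assignment."""
    # Dependency constraints: a consumer waits for the producer's result,
    # plus a copy delay when producer and consumer sit on different clusters.
    for u, v in deps:
        delay = latency[u] + (copy_cost if cluster[u] != cluster[v] else 0)
        if start[v] < start[u] + delay:
            return False
    # Resource constraints: each cluster issues at most issue_width
    # instructions in any single cycle.
    used = {}
    for i in range(len(start)):
        slot = (cluster[i], start[i])
        used[slot] = used.get(slot, 0) + 1
        if used[slot] > issue_width:
            return False
    return True


def best_schedule(deps, latency, n_clusters, issue_width, copy_cost, horizon):
    """Exhaustively search clusters and start cycles together (toy sizes only)."""
    n = len(latency)
    best = None
    for cluster in product(range(n_clusters), repeat=n):
        for start in product(range(horizon), repeat=n):
            if not is_feasible(deps, latency, cluster, start,
                               issue_width, copy_cost):
                continue
            length = max(start[i] + latency[i] for i in range(n))
            if best is None or length < best[0]:
                best = (length, cluster, start)
    return best


# Hypothetical block: instructions 0 and 1 feed 2, which feeds 3.
DEPS = [(0, 2), (1, 2), (2, 3)]
LATENCY = [1, 1, 1, 1]

if __name__ == "__main__":
    for cost in (0, 1):
        length, cluster, start = best_schedule(
            DEPS, LATENCY, n_clusters=2, issue_width=1,
            copy_cost=cost, horizon=4)
        print(f"copy_cost={cost}: length={length}, "
              f"clusters={cluster}, starts={start}")
```

On this toy block, with two single-issue clusters, the best schedule length is 3 cycles when copies are free and 4 cycles when a copy costs one cycle, which illustrates why intercluster communication cost is one of the hardware parameters the abstract singles out. The search is exponential in the number of instructions, so it only serves to show the constraints, not to scale.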
