Proceedings of the 1995 ACM/IEEE supercomputing conference

Compiling and Optimizing for Decoupled Architectures

Abstract

Decoupled architectures provide a key to the problem of sustained supercomputer performance through their ability to hide large memory latencies. When a program executes in a decoupled mode, the perceived memory latency at the processor is zero; effectively the entire physical memory has an access time equivalent to that of the processor's register file, and latency is completely hidden. However, the asynchronous functional units within a decoupled architecture must occasionally synchronize, incurring a high penalty. The goal of compiling and optimizing for decoupled architectures is to partition the program between the asynchronous functional units in such a way that latencies are hidden but synchronization events are executed infrequently. This paper describes a model for decoupled compilation, and explains the effectiveness of compilation for decoupled systems. A number of new compiler optimizations are introduced and evaluated quantitatively using the Perfect Club scientific benchmarks. We show that with a suitable repertoire of optimizations, it is possible to hide large latencies most of the time for most of the programs in the Perfect Club.
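The trade-off between latency hiding and synchronization cost can be illustrated with a toy cycle-count model. This is only a sketch under simplifying assumptions, not the compilation model the paper describes: the function names, the one-load-per-cycle issue rate, the one-cycle compute step, and the unbounded operand queue are all hypothetical choices made for illustration.

```python
def coupled_cycles(n, latency):
    """Cycles for n load+compute steps when every load stalls the processor.

    Assumption: each element costs the full memory latency plus one
    compute cycle, with no overlap between elements.
    """
    return n * (latency + 1)


def decoupled_cycles(n, latency, sync_every=None):
    """Cycles for the same work split across an access unit and an execute unit.

    Assumptions (hypothetical, for illustration only):
      * the access unit issues one load per cycle and runs ahead freely;
      * the execute unit consumes operands in order, one per cycle;
      * if sync_every is set, after every sync_every elements the access
        unit must wait for the execute unit to catch up (a synchronization
        event), which drains the queue of in-flight loads.
    """
    issue = 0       # cycle at which the access unit may issue the next load
    exec_free = 0   # cycle at which the execute unit becomes free
    for i in range(n):
        arrive = issue + latency          # operand arrives after the latency
        start = max(arrive, exec_free)    # execute waits for operand and unit
        exec_free = start + 1
        issue += 1
        if sync_every and (i + 1) % sync_every == 0:
            # Synchronization: the access unit cannot run ahead of execute.
            issue = max(issue, exec_free)
    return exec_free


if __name__ == "__main__":
    n, latency = 1000, 100
    print("coupled:            ", coupled_cycles(n, latency))
    print("decoupled, no sync: ", decoupled_cycles(n, latency))
    print("decoupled, sync/10: ", decoupled_cycles(n, latency, sync_every=10))
    print("decoupled, sync/1:  ", decoupled_cycles(n, latency, sync_every=1))
```

In this model the latency is paid once when the units never synchronize, once per block when they synchronize every few elements, and once per element when they synchronize on every element, at which point the decoupled machine is no faster than the coupled one. That is the behaviour the abstract attributes to infrequent versus frequent synchronization events.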
