Compiler techniques for evaluating and extending decoupled architectures (Data prefetching).

Abstract

Decoupled processing was first developed in 1981 to overcome the problem of increasing memory latencies. In the decoupled processing model, the compiler separates a program into those operations necessary to access memory (the access stream) and those operations necessary to perform the actual computational work of the program (the execute stream). These instruction streams are then executed on separate but cooperating processors, ideally allowing the access stream to slip ahead of the execute stream and issue requests to memory far in advance of the consumption of the data. This "slipping" is made possible by the extensive use of queues to buffer communication between the processors, and between the processors and memory.

Early studies of decoupled processing using small, hand-compiled benchmarks indicated that significant speed-up was possible. However, large, compiler-generated benchmarks are necessary to determine whether the performance potential is truly realizable. This dissertation describes the development of Daecomp, a compiler for decoupled access/execute architectures, focusing on the techniques that are unique to decoupled processing (code partitioning, interprocessor copy operations, mitigating code expansion, the handling of function calls, etc.).

Simulations using Daecomp-generated code produced results consistent with published results for the Livermore Loops. Significantly poorer results were obtained using two sets of larger benchmarks, underscoring the hazards of relying on small, hand-compiled benchmarks for architectural evaluations. Investigations into the causes of the poor performance identified branch operations that force the access processor to wait on the execute processor as the primary culprit.

Based on these results, the decoupled-style prefetch architecture (D-SPA) is proposed. D-SPA leverages the strengths and compiler techniques of decoupled processing, while permitting the use of advanced branch prediction and speculative execution techniques which are unavailable in the decoupled model. Despite using the decoupled model as a basis, D-SPA is shown to easily maintain binary compatibility with a typical RISC processor.

The results of investigations into the performance potential of D-SPA using large commercial benchmarks indicate that D-SPA has the potential to effectively prefetch data, significantly reducing the cache miss rate and increasing system performance.
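The slip mechanism described in the abstract can be illustrated with a toy cycle-count model. This is only a rough sketch under simplifying assumptions, not Daecomp or the simulator used in the dissertation: the function name `simulate_decoupled` and the parameters `mem_latency`, `queue_capacity`, and `sync_every` are hypothetical, memory is assumed to accept one request per cycle with a fixed latency, and the execute stream is assumed to consume one value per cycle. The optional `sync_every` parameter crudely mimics a branch that makes the access stream wait for the execute stream.

```python
from collections import deque


def simulate_decoupled(n_items, mem_latency, queue_capacity, sync_every=None):
    """Return the cycle count for the execute stream to consume n_items loads.

    Toy model (all parameters are illustrative assumptions):
    - The access stream issues at most one load per cycle while the load
      queue has a free slot.
    - Each load arrives mem_latency cycles after it is issued.
    - The execute stream consumes the head of the queue once it has arrived.
    - If sync_every is set, the access stream stalls after every sync_every
      loads until the execute stream has consumed all of them.
    """
    load_queue = deque()  # arrival cycle of each outstanding load, in order
    issued = 0
    consumed = 0
    cycle = 0
    while consumed < n_items:
        # Access stream: may it run ahead this cycle?
        in_sync = (
            sync_every is None
            or issued // sync_every == consumed // sync_every
        )
        if issued < n_items and len(load_queue) < queue_capacity and in_sync:
            load_queue.append(cycle + mem_latency)
            issued += 1
        # Execute stream: consume the oldest load if its data has arrived.
        if load_queue and load_queue[0] <= cycle:
            load_queue.popleft()
            consumed += 1
        cycle += 1
    return cycle


if __name__ == "__main__":
    print("deep queue:     ", simulate_decoupled(4, 10, 8))
    print("queue of one:   ", simulate_decoupled(4, 10, 1))
    print("sync every two: ", simulate_decoupled(4, 10, 8, sync_every=2))
```

In this model a queue of capacity one behaves like a processor that waits out the full latency on every load, a deep queue lets the latency be paid roughly once, and forced synchronization falls in between, which is the qualitative effect the abstract attributes to branches that stall the access processor.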

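Bibliographic record

  • Author

    Rich, Kevin Donald.

  • Author affiliation

    University of California, Davis.

  • Degree-granting institution University of California, Davis.
  • Subject Computer Science.
  • Degree Ph.D.
  • Year 2000
  • Pages 205 p.
  • Original format PDF
  • Language of text English
  • Chinese Library Classification Automation technology, computer technology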
