
Hardware optimizations enabled by a decoupled fetch architecture.



Abstract

In the pursuit of instruction-level parallelism, significant demands are placed on a processor's instruction delivery mechanism. To meet future processor execution targets, the instruction delivery mechanism must scale with the execution core. Attaining these targets is challenging due to I-cache misses, branch mispredictions, and taken branches in the instruction stream. Moreover, hardware scaling issues such as wire latency, clock scaling, and energy dissipation can impact processor design.

To address these issues, this thesis presents a fetch architecture that decouples the branch predictor from the instruction fetch unit. A Fetch Target Queue (FTQ) is inserted between the branch predictor and the instruction cache, allowing the branch predictor to run far in advance of the address currently being fetched by the instruction cache. This decoupling enables a number of architectural optimizations, including multi-level branch predictor design and fetch-directed instruction prefetching.

A multi-level branch predictor design consists of a small first-level predictor that can scale well to future technology sizes, together with larger higher-level predictors that provide the capacity for accurate branch prediction.

Fetch-directed instruction cache prefetching uses the stream of fetch addresses contained in the FTQ to guide instruction cache prefetching. By following the predicted fetch path, this technique provides more accurate prefetching than simply following a sequential fetch path.

Fetch-directed prefetching with a contemporary set-associative instruction cache raises complexity and energy dissipation concerns: set-associative caches provide a great deal of performance benefit, but dissipate a large amount of energy by blindly driving a number of associative ways.
Decoupling the tag and data components of the instruction cache enables a complexity-effective and energy-efficient scheme for fetch-directed instruction cache prefetching.

This thesis explores the decoupled front-end design and these related optimizations, and suggests future research directions.
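As a rough illustration of the decoupled front end described above, the following Python sketch models a branch predictor that enqueues predicted fetch targets into an FTQ while the instruction cache drains the head, with a prefetcher scanning the occupied entries behind the head. This is a toy model, not the thesis's implementation: the class and function names, the 16-byte fetch block, the 8-entry queue depth, and the purely sequential prediction are all illustrative assumptions.

```python
from collections import deque

FTQ_DEPTH = 8       # entries in the fetch target queue (illustrative)
FETCH_BLOCK = 16    # bytes per predicted fetch block (illustrative)

class FetchTargetQueue:
    """Queue of predicted fetch addresses between predictor and I-cache."""
    def __init__(self, depth=FTQ_DEPTH):
        self.q = deque(maxlen=depth)

    def full(self):
        return len(self.q) == self.q.maxlen

    def push(self, addr):
        if not self.full():
            self.q.append(addr)

    def pop(self):
        return self.q.popleft() if self.q else None

    def peek_all(self):
        # The prefetcher can scan entries beyond the head.
        return list(self.q)

def cycle(predictor_pc, ftq, fetched, prefetched, icache_stalled=False):
    """Simulate one cycle of the decoupled front end."""
    # The branch predictor runs ahead, enqueueing one fetch target per
    # cycle as long as the FTQ has room (here: always the next
    # sequential block, standing in for a real prediction).
    if not ftq.full():
        ftq.push(predictor_pc)
        predictor_pc += FETCH_BLOCK
    # Fetch-directed prefetch: issue prefetches for the predicted fetch
    # addresses queued behind the FTQ head.
    for addr in ftq.peek_all()[1:]:
        prefetched.add(addr)
    # The I-cache consumes the head entry unless it is stalled
    # (e.g. servicing a miss).
    if not icache_stalled:
        head = ftq.pop()
        if head is not None:
            fetched.append(head)
    return predictor_pc
```

Running this for five cycles with the I-cache stalled on cycles 2 through 4 shows the decoupling at work: the predictor advances to address 80 while the cache has only fetched blocks 0 and 16, and the prefetcher has already seen the queued targets 32, 48, and 64.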
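One way to picture the tag/data decoupling described above: because the FTQ provides lead time before an address is actually fetched, the tag compare can complete early, so only the one matching data way needs to be driven rather than all ways in parallel. A minimal sketch, assuming a 4-way cache and counting driven data ways as a stand-in for energy (the names and numbers here are illustrative, not details from the thesis):

```python
WAYS = 4  # cache associativity (illustrative)

def parallel_read(set_tags, target_tag):
    """Conventional set-associative read: all data ways are driven in
    parallel with the tag compare, hit or miss."""
    hit = target_tag in set_tags
    return hit, WAYS  # data ways driven

def serial_read(set_tags, target_tag):
    """FTQ-enabled read: the tag compare finishes ahead of the fetch
    (hidden by FTQ lead time), so only the hitting way is driven."""
    hit = target_tag in set_tags
    return hit, (1 if hit else 0)  # data ways driven
```

In this toy model a hit drives one data way instead of four, a fourfold cut in data-array activations; the real savings depend on the relative sizes of the tag and data arrays and on how much lead time the FTQ actually provides.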

Record

  • Author

    Reinman, Glenn D.

  • Affiliation

    University of California, San Diego.

  • Degree grantor: University of California, San Diego.
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2001
  • Pages: 196 p.
  • Format: PDF
  • Language: English
  • Classification (CLC): automation and computer technology
