
Hardware optimizations enabled by a decoupled fetch architecture.



Abstract

In the pursuit of instruction-level parallelism, significant demands are placed on a processor's instruction delivery mechanism. To meet future processor execution targets, the instruction delivery mechanism must scale with the execution core. Attaining these targets is challenging due to I-cache misses, branch mispredictions, and taken branches in the instruction stream. Moreover, hardware scaling issues such as wire latency, clock scaling, and energy dissipation can impact processor design.

To address these issues, this thesis presents a fetch architecture that decouples the branch predictor from the instruction fetch unit. A Fetch Target Queue (FTQ) is inserted between the branch predictor and the instruction cache, allowing the branch predictor to run far in advance of the address currently being fetched by the instruction cache. This decoupling enables a number of architectural optimizations, including multi-level branch predictor design and fetch-directed instruction prefetching.

A multi-level branch predictor design consists of a small first-level predictor that can scale well to future technology sizes, together with larger higher-level predictors that provide the capacity for accurate branch prediction.

Fetch-directed instruction cache prefetching uses the stream of fetch addresses contained in the FTQ to guide instruction cache prefetching. By following the predicted fetch path, this technique provides more accurate prefetching than simply following a sequential fetch path.

Fetch-directed prefetching with a contemporary set-associative instruction cache raises complexity and energy dissipation concerns: set-associative caches provide a great deal of performance benefit, but dissipate a large amount of energy by blindly driving a number of associative ways.
Decoupling the tag and data components of the instruction cache enables a complexity-effective and energy-efficient scheme for fetch-directed instruction cache prefetching.

This thesis explores the decoupled front-end design and these related optimizations, and suggests future research directions.
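As a rough illustration of the decoupled front end described above, the following Python sketch models a branch predictor that enqueues predicted fetch targets into an FTQ while the instruction cache drains the head, with a prefetcher scanning the occupied entries behind the head. This is a toy model, not the thesis's implementation: the class and function names, the 16-byte fetch block, the 8-entry queue depth, and the purely sequential prediction are all illustrative assumptions.

```python
from collections import deque

FTQ_DEPTH = 8       # entries in the fetch target queue (illustrative)
FETCH_BLOCK = 16    # bytes per predicted fetch block (illustrative)

class FetchTargetQueue:
    """Queue of predicted fetch addresses between predictor and I-cache."""
    def __init__(self, depth=FTQ_DEPTH):
        self.q = deque(maxlen=depth)

    def full(self):
        return len(self.q) == self.q.maxlen

    def push(self, addr):
        if not self.full():
            self.q.append(addr)

    def pop(self):
        return self.q.popleft() if self.q else None

    def peek_all(self):
        # The prefetcher can scan entries beyond the head.
        return list(self.q)

def cycle(predictor_pc, ftq, fetched, prefetched, icache_stalled=False):
    """Simulate one cycle of the decoupled front end."""
    # The branch predictor runs ahead, enqueueing one fetch target per
    # cycle as long as the FTQ has room (here: always the next
    # sequential block, standing in for a real prediction).
    if not ftq.full():
        ftq.push(predictor_pc)
        predictor_pc += FETCH_BLOCK
    # Fetch-directed prefetch: issue prefetches for the predicted fetch
    # addresses queued behind the FTQ head.
    for addr in ftq.peek_all()[1:]:
        prefetched.add(addr)
    # The I-cache consumes the head entry unless it is stalled
    # (e.g. servicing a miss).
    if not icache_stalled:
        head = ftq.pop()
        if head is not None:
            fetched.append(head)
    return predictor_pc
```

Running this for five cycles with the I-cache stalled on cycles 2 through 4 shows the decoupling at work: the predictor advances to address 80 while the cache has only fetched blocks 0 and 16, and the prefetcher has already seen the queued targets 32, 48, and 64.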
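One way to picture the tag/data decoupling described above: because the FTQ provides lead time before an address is actually fetched, the tag compare can complete early, so only the one matching data way needs to be driven rather than all ways in parallel. A minimal sketch, assuming a 4-way cache and counting driven data ways as a stand-in for energy (the names and numbers here are illustrative, not details from the thesis):

```python
WAYS = 4  # cache associativity (illustrative)

def parallel_read(set_tags, target_tag):
    """Conventional set-associative read: all data ways are driven in
    parallel with the tag compare, hit or miss."""
    hit = target_tag in set_tags
    return hit, WAYS  # data ways driven

def serial_read(set_tags, target_tag):
    """FTQ-enabled read: the tag compare finishes ahead of the fetch
    (hidden by FTQ lead time), so only the hitting way is driven."""
    hit = target_tag in set_tags
    return hit, (1 if hit else 0)  # data ways driven
```

In this toy model a hit drives one data way instead of four, a fourfold cut in data-array activations; the real savings depend on the relative sizes of the tag and data arrays and on how much lead time the FTQ actually provides.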

Record

  • Author

    Reinman, Glenn D.

  • Affiliation

    University of California, San Diego.

  • Degree grantor: University of California, San Diego.
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2001
  • Pages: 196 p.
  • Format: PDF
  • Language: English
  • Classification (CLC): automation and computer technology
