Temporal instruction fetch streaming


Abstract

L1 instruction-cache misses pose a critical performance bottleneck in commercial server workloads. Cache access latency constraints preclude L1 instruction caches large enough to capture the application, library, and OS instruction working sets of these workloads. To cope with capacity constraints, researchers have proposed instruction prefetchers that use branch predictors to explore future control flow. However, such prefetchers suffer from several fundamental flaws: their lookahead is limited by branch prediction bandwidth, their accuracy suffers from geometrically-compounding branch misprediction probability, and they are ignorant of the cache contents, frequently predicting blocks already present in L1. Hence, L1 instruction misses remain a bottleneck. We propose Temporal Instruction Fetch Streaming (TIFS)—a mechanism for prefetching temporally-correlated instruction streams from lower-level caches. Rather than explore a program’s control flow graph, TIFS predicts future instruction-cache misses directly, through recording and replaying recurring L1 instruction miss sequences. In this paper, we first present an information-theoretic offline trace analysis of instruction-miss repetition to show that 94% of L1 instruction misses occur in long, recurring sequences. Then, we describe a practical mechanism to record these recurring sequences in the L2 cache and leverage them for instruction-cache prefetching. Our TIFS design requires less than 5% storage overhead over the baseline L2 cache and improves performance by 11% on average and 24% at best in a suite of commercial server workloads.
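The record-and-replay idea in the abstract can be illustrated with a toy model. The sketch below is not the TIFS hardware design: the class name `MissStreamPrefetcher`, the parameters `cache_blocks` and `lookahead`, the FIFO stand-in for the L1 instruction cache, and the Python list used as the miss log are all assumptions made for illustration. It only shows the general principle of logging miss addresses in order and, on a later miss to a logged address, prefetching the addresses that followed it last time.

```python
from collections import deque


class MissStreamPrefetcher:
    """Toy record-and-replay model of an instruction-miss stream prefetcher.

    Hypothetical illustration only; structure sizes and policies are assumed.
    """

    def __init__(self, cache_blocks=4, lookahead=3):
        self.cache = deque(maxlen=cache_blocks)  # tiny FIFO stand-in for the L1-I
        self.prefetched = set()   # blocks brought in ahead of use
        self.log = []             # miss addresses, in order of occurrence
        self.index = {}           # block -> position of its latest log entry
        self.lookahead = lookahead
        self.misses = 0           # misses that no prefetch covered
        self.covered = 0          # would-be misses covered by a prefetch

    def fetch(self, block):
        if block in self.cache:
            return "hit"

        if block in self.prefetched:
            self.prefetched.discard(block)
            self.covered += 1
            outcome = "prefetch-hit"
        else:
            self.misses += 1
            outcome = "miss"

        # Replay: if this block was logged before, prefetch the blocks
        # that followed it in the recorded sequence.
        prev = self.index.get(block)
        if prev is not None:
            for addr in self.log[prev + 1 : prev + 1 + self.lookahead]:
                if addr not in self.cache:
                    self.prefetched.add(addr)

        # Record: append to the log (covered misses are logged too, so the
        # recorded stream stays intact) and point the index at the new entry.
        self.index[block] = len(self.log)
        self.log.append(block)
        self.cache.append(block)
        return outcome


if __name__ == "__main__":
    p = MissStreamPrefetcher(cache_blocks=2, lookahead=3)
    trace = [1, 2, 3, 4, 5, 6] * 2  # a recurring sequence of block addresses
    print([p.fetch(b) for b in trace])
    print("uncovered misses:", p.misses, "covered:", p.covered)
```

In this toy, the second pass over a recurring sequence pays one miss to locate the stream and then finds the following blocks already prefetched. Note that the model consults the cache before prefetching, which mirrors the abstract's point that a miss-driven predictor need not re-predict blocks already present in L1.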


Bibliographic details

Source
IEEE/ACM International Symposium on Microarchitecture | 2008 | 10 pages
Authors
Ferdman Michael; Wenisch Thomas F.; Ailamaki Anastasia; Falsafi Babak; Moshovos Andreas
Original format
PDF
Chinese Library Classification
TP313-53
Keywords
caching; fetch-directed; instruction streaming; prefetching

