Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers

Abstract

Projections of computer technology forecast processors with peak performance of 1,000 MIPS in the relatively near future. These processors could easily lose half or more of their performance in the memory hierarchy if the hierarchy design is based on conventional caching techniques. This paper presents hardware techniques to improve the performance of caches.

Miss caching places a small fully-associative cache between a cache and its refill path. Misses in the cache that hit in the miss cache have only a one cycle miss penalty, as opposed to a many cycle miss penalty without the miss cache. Small miss caches of 2 to 5 entries are shown to be very effective in removing mapping conflict misses in first-level direct-mapped caches.
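The miss-cache behaviour described above can be sketched as a toy simulator. This is a minimal illustration, not the paper's implementation: the class name `MissCachedDM`, the use of whole line addresses instead of tag/index fields, and the LRU replacement policy for the miss cache are all assumptions made for the example.

```python
from collections import OrderedDict


class MissCachedDM:
    """Toy direct-mapped cache backed by a small fully-associative miss cache.

    Hypothetical model: addresses are line addresses, and the miss cache
    is assumed to use LRU replacement.
    """

    def __init__(self, num_sets, miss_entries):
        self.num_sets = num_sets
        self.lines = [None] * num_sets      # line address held by each set
        self.miss_entries = miss_entries
        self.miss_cache = OrderedDict()     # keys are line addresses, oldest first

    def access(self, line_addr):
        index = line_addr % self.num_sets
        if self.lines[index] == line_addr:
            return "hit"
        # On any miss the requested line is loaded into the direct-mapped cache.
        self.lines[index] = line_addr
        if line_addr in self.miss_cache:
            # Short (one-cycle) miss: the line was still in the miss cache.
            self.miss_cache.move_to_end(line_addr)
            return "miss-cache hit"
        # Long miss: the requested line is also copied into the miss cache.
        self.miss_cache[line_addr] = None
        if len(self.miss_cache) > self.miss_entries:
            self.miss_cache.popitem(last=False)   # evict least recently used
        return "miss"


cache = MissCachedDM(num_sets=4, miss_entries=2)
# Lines 0 and 4 map to the same set and keep evicting each other.
results = [cache.access(a) for a in (0, 4, 0, 4)]
print(results)  # ['miss', 'miss', 'miss-cache hit', 'miss-cache hit']
```

In this sketch two miss-cache entries are needed to absorb two alternating lines, because the miss cache duplicates the line that was just loaded into the direct-mapped cache.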

Victim caching is an improvement to miss caching that loads the small fully-associative cache with the victim of a miss and not the requested line. Small victim caches of 1 to 5 entries are even more effective at removing conflict misses than miss caching.
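As a rough illustration of the victim-cache idea, the sketch below saves the evicted line (the victim) rather than the requested line, and swaps lines on a victim-cache hit. The class name `VictimCachedDM`, the line-address model, and the LRU replacement policy are assumptions for this example only.

```python
from collections import OrderedDict


class VictimCachedDM:
    """Toy direct-mapped cache with a small fully-associative victim cache.

    Hypothetical model: addresses are line addresses, and the victim cache
    is assumed to use LRU replacement.
    """

    def __init__(self, num_sets, victim_entries):
        self.num_sets = num_sets
        self.lines = [None] * num_sets      # line address held by each set
        self.victim_entries = victim_entries
        self.victims = OrderedDict()        # keys are evicted line addresses

    def access(self, line_addr):
        index = line_addr % self.num_sets
        resident = self.lines[index]
        if resident == line_addr:
            return "hit"
        if line_addr in self.victims:
            # Swap: the requested line leaves the victim cache ...
            del self.victims[line_addr]
            result = "victim hit"
        else:
            result = "miss"
        self.lines[index] = line_addr
        if resident is not None:
            # ... and the line it displaces becomes the newest victim.
            self.victims[resident] = None
            if len(self.victims) > self.victim_entries:
                self.victims.popitem(last=False)  # evict least recently used
        return result


cache = VictimCachedDM(num_sets=4, victim_entries=1)
# Lines 0 and 4 conflict in set 0; a single victim entry absorbs the conflict.
results = [cache.access(a) for a in (0, 4, 0, 4)]
print(results)  # ['miss', 'miss', 'victim hit', 'victim hit']
```

Because the victim cache never holds a copy of a line that is currently in the direct-mapped cache, a single entry is enough here to catch two alternating lines.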

Stream buffers prefetch cache lines starting at a cache miss address. The prefetched data is placed in the buffer and not in the cache. Stream buffers are useful in removing capacity and compulsory cache misses, as well as some instruction cache conflict misses. Stream buffers are more effective than previously investigated prefetch techniques at using the next slower level in the memory hierarchy when it is pipelined. An extension to the basic stream buffer, called multi-way stream buffers, is introduced. Multi-way stream buffers are useful for prefetching along multiple intertwined data reference streams.
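A single sequential stream buffer can be sketched as a FIFO of prefetched line addresses placed beside the cache. This is a simplified model under stated assumptions: the class name `StreamBufferedDM` is hypothetical, prefetches are treated as completing instantly (memory latency and pipelining are ignored), and only the head of the buffer is compared against the miss address.

```python
from collections import deque


class StreamBufferedDM:
    """Toy direct-mapped cache with one sequential stream buffer.

    Hypothetical model: addresses are line addresses and prefetched
    lines are assumed to arrive immediately.
    """

    def __init__(self, num_sets, depth):
        self.num_sets = num_sets
        self.lines = [None] * num_sets   # line address held by each set
        self.depth = depth
        self.buffer = deque()            # prefetched line addresses, head first
        self.next_prefetch = None        # next sequential line to request

    def access(self, line_addr):
        index = line_addr % self.num_sets
        if self.lines[index] == line_addr:
            return "hit"
        self.lines[index] = line_addr
        if self.buffer and self.buffer[0] == line_addr:
            # Head matches: move the line into the cache and top up the buffer.
            self.buffer.popleft()
            self.buffer.append(self.next_prefetch)
            self.next_prefetch += 1
            return "stream hit"
        # Otherwise flush and restart prefetching after the miss address.
        start = line_addr + 1
        self.buffer = deque(range(start, start + self.depth))
        self.next_prefetch = start + self.depth
        return "miss"


cache = StreamBufferedDM(num_sets=4, depth=4)
# A purely sequential scan misses once, then is served from the buffer.
results = [cache.access(a) for a in (10, 11, 12, 13, 14)]
print(results)  # ['miss', 'stream hit', 'stream hit', 'stream hit', 'stream hit']
```

Note that the prefetched lines sit in the buffer, not in the cache, until they are actually requested, so a wrong guess does not displace useful cache contents. Several such buffers running in parallel would approximate the multi-way variant.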

Together, victim caches and stream buffers reduce the miss rate of the first level in the cache hierarchy by a factor of two to three on a set of six large benchmarks.

Bibliographic record

Source: Annual international symposium on Computer Architecture; International symposium on Computer Architecture | 1990 | pp. 364-373 | 10 pages
Author: Norman P. Jouppi
Original format: PDF
Chinese Library Classification: Overall structure; system architecture

Similar literature

1. Selective victim caching: a method to improve the performance of direct-mapped caches [J]. Stiliadis D., Varma A. IEEE Transactions on Computers. 1997, no. 5
2. Improving Data Cache Performance with Integrated Use of Split Caches, Victim Cache and Stream Buffers [J]. Afrin Naz, Mehran Rezaei, Krishna Kavi. Computer architecture news. 2005, no. 3
3. Improving Trace Cache Processor Performance by Trace Cache Hierarchy and Path-based Trace Prefetch [J]. WANG, Kaifeng, JI, 电子学报：英文版. 2006, no. 2
4. Retrospective: improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers [C]. Norman P. Jouppi. 25 years of the international symposia on Computer architecture. 1998
5. Improving memory hierarchy performance with hardware prefetching and cache replacement. [D]. Lin, Wei-Fen. 2002
6. Combining Instruction Prefetching with Partial Cache Locking to Improve WCET in Real-Time Systems [O]. Fan Ni, Xiang Long, Han Wan
7. RETROSPECTIVE: Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers [O]. 2009
8. Effectiveness of Caches and Data Prefetch Buffers in Large-Scale Shared Memory Multiprocessors [R]. Lee, R. L. 1987
