CPU cache prefetching: Timing evaluation of hardware implementations

Tse J.; Smith A.J.

Abstract

Prefetching into CPU caches has long been known to be effective in reducing the cache miss ratio, but known implementations of prefetching have been unsuccessful in improving CPU performance. The reasons for this are that prefetches interfere with normal cache operations by making the cache address and data ports busy, the memory bus busy, and the memory banks busy, and that a prefetch is not necessarily complete by the time the prefetched data is actually referenced. In this paper, we present extensive quantitative results of a detailed cycle-by-cycle trace-driven simulation of a uniprocessor memory system in which we vary most of the relevant parameters in order to determine when and if hardware prefetching is useful. We find that, in order for prefetching to actually improve performance, the address array needs to be double ported and the data array needs to be either double ported or fully buffered. It is also very helpful for the bus to be very wide (e.g., 16 bytes), for bus transactions to be split, and for main memory to be interleaved. Under the best circumstances, i.e., with a significant investment in extra hardware, prefetching can significantly improve performance. For implementations without adequate hardware, prefetching often decreases performance.
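
The distinction the abstract draws between miss ratio and actual performance can be illustrated with a toy trace-driven model. The sketch below is not the authors' simulator: the direct-mapped organization, the prefetch-on-miss policy, and every parameter value (`miss_penalty`, `prefetch_port_cycles`, line and cache sizes) are assumptions chosen only for illustration. It charges the CPU a fixed stall whenever a prefetch occupies a single cache port, and treats prefetched data as arriving instantly, which a real cycle-by-cycle simulation would not.

```python
def simulate(trace, num_lines=64, line_size=16, prefetch=False,
             miss_penalty=10, prefetch_port_cycles=2):
    """Toy direct-mapped cache model (hypothetical parameters).

    Returns (miss_ratio, total_cycles) for a list of byte addresses.
    prefetch_port_cycles models a single-ported cache in which each
    prefetch keeps the port busy and stalls the CPU; set it to 0 to
    approximate a port that prefetches do not contend for.
    """
    tags = [None] * num_lines      # block number held by each cache line
    misses = 0
    cycles = 0
    for addr in trace:
        block = addr // line_size
        idx = block % num_lines
        cycles += 1                # the demand access itself
        if tags[idx] != block:     # demand miss
            misses += 1
            cycles += miss_penalty
            tags[idx] = block
            if prefetch:           # prefetch-on-miss of the next line
                nxt = block + 1
                nidx = nxt % num_lines
                if tags[nidx] != nxt:
                    tags[nidx] = nxt
                    cycles += prefetch_port_cycles  # port contention stall
    return misses / len(trace), cycles


# Sequential 4-byte accesses: the next line is always used.
sequential = list(range(0, 4096, 4))
# 32-byte stride: the prefetched next line is never used.
strided = list(range(0, 8192, 32))

for name, trace in (("sequential", sequential), ("strided", strided)):
    base = simulate(trace, prefetch=False)
    pref = simulate(trace, prefetch=True)
    print(name, "no prefetch:", base, "prefetch:", pref)
```

In this simplified model, the sequential trace sees both its miss ratio and its cycle count drop with prefetching, while the strided trace keeps the same miss ratio but takes more cycles, because every prefetch ties up the port without ever being used. Setting `prefetch_port_cycles=0` removes that penalty, loosely mirroring the abstract's point that extra ports or buffering are needed before prefetching pays off.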

Bibliographic details

Source
IEEE Transactions on Computers, 1998, No. 5, pp. 509-526 (18 pages)
Authors
Tse J.; Smith A.J.
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology

Similar literature

1. An intelligent cache system with hardware prefetching for high performance [J]. Jung-Hoon Lee, Seh-woong Jeong, Shin-Dug Kim. IEEE Transactions on Computers, 2003, No. 5.
2. Increasing hardware data prefetching performance using the second-level cache [J]. Nathalie Drach, Jean-Luc Bechennec. Journal of Systems Architecture, 2002, No. 4a5.
3. Hardware and software cache prefetching techniques for MPEG benchmarks [J]. Zucker D.F., Lee R.B. IEEE Transactions on Circuits and Systems for Video Technology, 2000, No. 5.
4. Prefetch-guard: Leveraging hardware prefetches to defend against cache timing channels [C]. Hongyu Fang, Sai Santosh Dayapule, Fan Yao. IEEE International Symposium on Hardware Oriented Security and Trust, 2018.
5. Redesigning database systems in light of CPU cache prefetching [D]. Chen, Shimin. 2005.
6. Implementation and Evaluation of Open-Source Hardware to Monitor Water Quality in Precision Aquaculture [O]. Rafael Apolinar Bórquez López, Luis Rafael Martinez Cordova, Juan Carlos Gil Nuñez. 2020.
7. Differential cache-collision timing attacks on AES with applications to embedded CPUs [O]. Bogdanov Andrey, Eisenbarth Thomas, Paar Christof. 2010.
8. Preliminary Evaluation of Cache-Miss-Initiated Prefetching Techniques in Scalable Multiprocessors [R]. Bianchini, R., LeBlanc, T. J. 1994.
