IEEE Pacific Rim Conference on Communications, Computers and Signal Processing

Cache prefetching and speculation on multi-threaded processors



Abstract

Data prefetching is an important mechanism for hiding memory latency in single-threaded, desktop workloads. For multi-threaded, commercial workloads, prefetching offers much more modest improvements in performance at a high cost in cache power and bandwidth to the higher level caches. This paper shows that by combining speculation with a selective prefetching scheme, we can reduce the cache access power overhead while improving performance. We demonstrate that “likely-to-miss” load instructions can be accurately identified and we propose two hardware-based techniques for improving load latencies in multi-threaded commercial workloads. First, we modify a next-four-lines prefetching scheme to only perform the prefetch for likely-to-miss loads. Second, we forward addresses for likely-to-miss loads to the L2 and L3 caches for tag look-up immediately after address translation. Combined, these two techniques reduce the extra cache access power of the L3 cache by up to 53% while slightly improving performance when compared with a simple next-four-lines prefetcher running standard, commercial-workload benchmarks.
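The abstract does not say how "likely-to-miss" loads are identified in hardware, so the following is only a minimal sketch of the selective-prefetch idea, assuming a PC-indexed table of saturating miss counters gating a next-four-lines prefetch; the table size, line size, and threshold are illustrative and not taken from the paper.

/*
 * Sketch (not the paper's exact hardware): a next-four-lines prefetcher
 * that only fires for loads predicted likely to miss.
 * Assumed predictor: PC-indexed 2-bit saturating counters.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

#define LINE_SIZE      64u      /* bytes per cache line (assumed)      */
#define PRED_ENTRIES   1024u    /* predictor table entries (assumed)   */
#define PREFETCH_DEPTH 4u       /* "next four lines"                   */

static uint8_t miss_ctr[PRED_ENTRIES];   /* 2-bit saturating counters */

static unsigned pred_index(uint64_t pc) {
    return (unsigned)((pc >> 2) % PRED_ENTRIES);
}

/* Counter value >= 2 means "likely to miss". */
static bool likely_to_miss(uint64_t pc) {
    return miss_ctr[pred_index(pc)] >= 2;
}

/* Train the predictor with the actual L1 hit/miss outcome. */
static void train(uint64_t pc, bool missed) {
    uint8_t *c = &miss_ctr[pred_index(pc)];
    if (missed) { if (*c < 3) (*c)++; }
    else        { if (*c > 0) (*c)--; }
}

/* On each load: prefetch the next four lines only if predicted to miss. */
static void on_load(uint64_t pc, uint64_t vaddr, bool l1_missed) {
    if (likely_to_miss(pc)) {
        uint64_t line = vaddr / LINE_SIZE;
        for (unsigned i = 1; i <= PREFETCH_DEPTH; i++)
            printf("prefetch line 0x%llx (pc=0x%llx)\n",
                   (unsigned long long)((line + i) * LINE_SIZE),
                   (unsigned long long)pc);
    }
    train(pc, l1_missed);
}

int main(void) {
    /* Toy trace: one load PC that keeps missing in the L1. After two
     * observed misses the predictor starts issuing prefetches for it. */
    for (int i = 0; i < 4; i++)
        on_load(0x400100, 0x10000 + (uint64_t)i * 4096, /*l1_missed=*/true);
    return 0;
}

The paper's second technique, forwarding the translated address of a likely-to-miss load to the L2 and L3 tags right after address translation, would be gated by the same predicate; it is not modeled in this sketch.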
