IEEE International Parallel and Distributed Processing Symposium Workshops

Mitigating Critical Path Decompression Latency in Compressed L1 Data Caches Via Prefetching

Abstract

Increasing the size of cache memory is a common approach for reducing miss rates and increasing performance in a CPU. Doing this, however, increases the static and dynamic energy consumption of the cache. Compression can be used to increase the effective capacity of cache memory without physically increasing its size. Compression can also reduce the physical size of the cache, and therefore its energy consumption, while maintaining a reasonable effective cache capacity. Unfortunately, accessing compressed data incurs a decompression latency. This latency sits on the critical execution path of the processor and can significantly degrade performance, especially when compression is implemented in the L1 cache. Previous work has used cache prefetching techniques to hide the latency of lower-level memory accesses. Our work proposes combining data prefetching and compression techniques to reduce the impact of decompression latency and improve the feasibility of compression in L1 caches. We evaluate the performance of Last Outcome (LO), Stride (S), and Two-Level (2L) prefetching, as well as hybrid combinations of these methods (S/LO and 2L/S), in combination with Base-Delta-Immediate (BΔI) compression. The results demonstrate that using BΔI together with data prefetching improves performance over BΔI compression alone in the L1 data cache. We find that a 4KB Hybrid S/LO prefetcher yields an average speedup of 1.7% and improves the CPU's energy-delay product by 1.5% versus BΔI alone.
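To make the trade-off in the abstract concrete, the following is a minimal sketch of the Base-Delta-Immediate idea: a cache line whose words cluster near a common base value can be stored as one base plus narrow deltas, and decompression reconstructs each word with an add. The function names, word size, and delta width here are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of Base-Delta-Immediate (BDI) compression, assuming a
# cache line of 4-byte words and a single base with fixed-width deltas.

def bdi_compress(line, delta_bytes=1):
    """Try to encode a list of word values as (base, deltas).

    Returns (base, deltas) if every word's offset from the base fits in
    a signed delta of `delta_bytes` bytes; otherwise returns None and
    the line would be stored uncompressed.
    """
    base = line[0]
    limit = 1 << (8 * delta_bytes - 1)      # signed range of one delta
    deltas = [w - base for w in line]
    if all(-limit <= d < limit for d in deltas):
        return base, deltas
    return None

def bdi_decompress(base, deltas):
    # The add per word is the decompression step; in hardware this is
    # the extra L1 critical-path latency that prefetching tries to hide.
    return [base + d for d in deltas]

# Example: pointers into the same region compress well under BDI.
line = [0x40001000, 0x40001008, 0x40001010, 0x40001018]
packed = bdi_compress(line)                 # one base + four 1-byte deltas
assert packed is not None
assert bdi_decompress(*packed) == line
```

The example line shrinks from four 4-byte words to one 4-byte base plus four 1-byte deltas, which is the capacity gain the abstract trades against decompression latency.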

