IEEE International Parallel and Distributed Processing Symposium Workshops

Mitigating Critical Path Decompression Latency in Compressed L1 Data Caches Via Prefetching



Abstract

Increasing the size of cache memory is a common approach for reducing miss rates and increasing CPU performance. Doing so, however, increases the static and dynamic energy consumption of the cache. Compression can be used to increase the effective capacity of cache memory without physically increasing its size. Compression can also be used to reduce the physical size of the cache, and therefore its energy consumption, while maintaining a reasonable effective capacity. Unfortunately, decompression latency is incurred when accessing compressed data. This latency sits on the critical execution path of the processor and can have a significant impact on performance, especially in the L1 cache. Previous work has used cache prefetching techniques to hide the latency of lower-level memory accesses. Our work combines data prefetching and compression techniques to reduce the impact of decompression latency and improve the feasibility of compression in L1 caches. We evaluate the performance of Last Outcome (LO), Stride (S), and Two-Level (2L) prefetching, as well as hybrid combinations of these methods (S/LO and 2L/S), in combination with Base-Delta-Immediate (BΔI) compression. The results demonstrate that BΔI combined with data prefetching improves performance over BΔI compression alone in the L1 data cache. We find that a 4KB hybrid S/LO prefetcher yields an average speedup of 1.7% and a 1.5% improvement in the energy-delay product of the CPU versus BΔI alone.
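The core idea behind BΔI is that many cache lines hold values with low dynamic range (e.g. pointers into the same region), so the line can be stored as one base value plus narrow deltas. The sketch below is a minimal illustration of that idea, not the paper's hardware implementation; the function names and the single-base, fixed-delta-width simplification are our own assumptions (the real BΔI scheme tries several base/delta size combinations in parallel).

```python
def bdi_compress(words, delta_bytes=1):
    """Try to encode a list of words as (base, deltas).

    Returns (base, deltas) if every word's delta from the first word
    fits in `delta_bytes` signed bytes; otherwise returns None and
    the line would be stored uncompressed.
    """
    base = words[0]
    limit = 1 << (8 * delta_bytes - 1)      # signed range is [-limit, limit)
    deltas = [w - base for w in words]
    if all(-limit <= d < limit for d in deltas):
        return base, deltas
    return None


def bdi_decompress(base, deltas):
    # This add per word is the decompression step that sits on the
    # load critical path, motivating latency-hiding via prefetching.
    return [base + d for d in deltas]


# Example: eight pointers into the same region compress well.
line = [0x1000_0000 + i * 8 for i in range(8)]
packed = bdi_compress(line)
assert packed is not None
assert bdi_decompress(*packed) == line
```

With 1-byte deltas, an 8-word line of 4-byte values shrinks from 32 bytes to roughly 12 (one base plus eight deltas), which is the kind of effective-capacity gain compression trades against decompression latency.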
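Of the prefetchers evaluated, the stride prefetcher is the easiest to sketch: a small table keyed by load PC records the last address and last observed stride, and when the same stride repeats, the next address is predicted. The class below is a minimal software model under our own assumptions (unbounded table, single-stride confidence); it is not the paper's configuration.

```python
class StridePrefetcher:
    """Toy PC-indexed stride prefetcher model."""

    def __init__(self):
        # pc -> (last_addr, last_stride); real hardware would bound
        # this table and tag entries, which we omit for clarity.
        self.table = {}

    def access(self, pc, addr):
        """Record a demand access; return a prefetch address or None."""
        if pc not in self.table:
            self.table[pc] = (addr, 0)
            return None
        last_addr, last_stride = self.table[pc]
        stride = addr - last_addr
        self.table[pc] = (addr, stride)
        if stride == last_stride and stride != 0:
            # Stride confirmed twice in a row: predict the next access.
            return addr + stride
        return None


# A load walking an array with 0x40-byte stride triggers a prefetch
# on its third access.
p = StridePrefetcher()
p.access(pc=0x400, addr=0x1000)
p.access(pc=0x400, addr=0x1040)
assert p.access(pc=0x400, addr=0x1080) == 0x10C0
```

In the combined scheme the paper studies, a prefetch issued this way gives the cache time to decompress the BΔI-encoded line before the demand access arrives, hiding the decompression latency.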
