Computer Architecture News
BEAR: Techniques for Mitigating Bandwidth Bloat in Gigascale DRAM Caches
Abstract

Die stacking memory technology can enable gigascale DRAM caches that can operate at 4x-8x higher bandwidth than commodity DRAM. Such caches can improve system performance by servicing data at a faster rate when the requested data is found in the cache, potentially increasing the memory bandwidth of the system by 4x-8x. Unfortunately, a DRAM cache uses the available memory bandwidth not only for data transfer on cache hits, but also for other secondary operations such as cache miss detection, fill on cache miss, and writeback lookup and content update on dirty evictions from the last-level on-chip cache. Ideally, we want the bandwidth consumed for such secondary operations to be negligible, and have almost all the bandwidth be available for transfer of useful data from the DRAM cache to the processor. We evaluate a 1GB DRAM cache, architected as Alloy Cache, and show that even the most bandwidth-efficient proposal for DRAM cache consumes 3.8x bandwidth compared to an idealized DRAM cache that does not consume any bandwidth for secondary operations. We also show that redesigning the DRAM cache to minimize the bandwidth consumed by secondary operations can potentially improve system performance by 22%. To that end, this paper proposes Bandwidth Efficient ARchitecture (BEAR) for DRAM caches. BEAR integrates three components, one each for reducing the bandwidth consumed by miss detection, miss fill, and writeback probes. BEAR reduces the bandwidth consumption of DRAM cache by 32%, which reduces cache hit latency by 24% and increases overall system performance by 10%. BEAR, with negligible overhead, outperforms an idealized SRAM Tag-Store design that incurs an unacceptable overhead of 64 megabytes, as well as Sector Cache designs that incur an SRAM storage overhead of 6 megabytes.
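The abstract's accounting of "bandwidth bloat" can be sketched as a back-of-envelope model: every access to a tag-in-data cache such as Alloy Cache reads tag plus data for miss detection, every miss triggers a fill, and dirty evictions from the on-chip cache trigger writeback probes and updates. The parameters below (line size, tag size, hit rate, dirty-eviction rate) are illustrative assumptions, not figures from the paper.

```python
LINE = 64  # cache line size in bytes (assumed)
TAG = 8    # tag metadata moved with each Alloy-Cache probe (assumed)

def bloat(hit_rate, dirty_evicts_per_access):
    """Bytes moved per access, relative to an idealized cache that
    spends bandwidth only on the data returned for hits."""
    useful = hit_rate * LINE                  # hit data delivered to the core
    probe = LINE + TAG                        # tag+data read on every access (miss detection)
    fill = (1 - hit_rate) * LINE              # install the line on each miss
    writeback = dirty_evicts_per_access * (TAG + LINE)  # probe + content update
    return (probe + fill + writeback) / useful

# With these toy parameters, secondary operations inflate traffic
# to several times the useful hit bandwidth:
print(round(bloat(hit_rate=0.5, dirty_evicts_per_access=0.25), 2))
```

BEAR's three components map onto the three non-useful terms in this model: shrinking the probe cost of miss detection, avoiding unnecessary fills, and filtering writeback probes.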
