IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Architecting On-Chip DRAM Cache for Simultaneous Miss Rate and Latency Reduction

Abstract

On-chip dynamic random access memory (DRAM) caches have recently been employed in the memory hierarchy to mitigate the widening latency gap between high-speed cores and off-chip memory. Two important parameters are the DRAM cache miss rate (D$-MR) and the DRAM cache hit latency (D$-HL), as they strongly influence performance. Both depend on the DRAM set mapping policy, and recently proposed DRAM set mapping policies are predominantly optimized for either D$-MR or D$-HL. We propose novel DRAM set mapping policies that simultaneously reduce D$-MR (via high associativity) and D$-HL (via improved row buffer hit rates). To further improve D$-HL, we propose a small, low-latency DRAM tag cache (DTC) structure that can quickly determine whether an access to the DRAM cache will be a hit or a miss. Because the performance of the proposed DTC depends on its hit rate, we also present a novel DTC insertion policy that increases the DTC hit rate. We investigate the latency and miss rate tradeoffs when designing a DRAM cache hierarchy and analyze the effects of different policies on overall performance. We evaluate our policies on a wide variety of workloads and compare their performance with three recent proposals for on-chip DRAM caches. For a 16-core system, our set mapping policy, together with our DTC and its adaptive DTC insertion policy, improves the harmonic mean instructions-per-cycle throughput by 25.4%, 15.5%, and 7.3% compared to the state of the art, while requiring 55% less storage overhead for DRAM cache hit/miss prediction.
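
The throughput figures above are reported as a harmonic mean of instructions per cycle (IPC). As a minimal sketch of how such a metric can be computed, assuming it is the plain harmonic mean over per-core IPC values (the abstract does not spell out the exact formula), the following Python snippet may help. The function names and the sample IPC values are hypothetical and are not taken from the paper.

```python
def harmonic_mean_ipc(per_core_ipc):
    """Harmonic mean of per-core IPC values.

    Assumption: the metric is n / sum(1 / ipc_i) over all cores.
    The harmonic mean is dominated by the slowest cores, so it
    penalizes configurations that starve individual cores.
    """
    if not per_core_ipc:
        raise ValueError("need at least one IPC value")
    if any(ipc <= 0 for ipc in per_core_ipc):
        raise ValueError("IPC values must be positive")
    return len(per_core_ipc) / sum(1.0 / ipc for ipc in per_core_ipc)


def relative_improvement_percent(new, baseline):
    """Percentage improvement of `new` over `baseline`."""
    return (new - baseline) / baseline * 100.0


# Hypothetical per-core IPC values for a 4-core run (illustration only).
baseline_ipc = [1.0, 0.5, 1.0, 0.5]
improved_ipc = [1.0, 1.0, 1.0, 1.0]

baseline_hm = harmonic_mean_ipc(baseline_ipc)
improved_hm = harmonic_mean_ipc(improved_ipc)
gain = relative_improvement_percent(improved_hm, baseline_hm)
```

In this toy input, raising only the two slow cores from 0.5 to 1.0 IPC lifts the harmonic mean from about 0.67 to 1.0, which illustrates why the metric rewards policies that help the worst-performing cores.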
