Journal: ACM Transactions on Architecture and Code Optimization

NUCA-L1: A Non-Uniform Access Latency Level-1 Cache Architecture for Multicores Operating at Near-Threshold Voltages

Abstract

Research has shown that operating in the near-threshold region is expected to provide up to a 10x improvement in energy efficiency for future processors. However, reliable operation below a minimum voltage (Vccmin) cannot be guaranteed due to process variations. Because SRAM margins can easily be violated at near-threshold voltages, SRAM bit-cell failure rates are expected to rise steeply. Multicore processors rely on fast private L1 caches to exploit data locality and achieve high performance. In the presence of high bit-cell fault rates, an L1 cache traditionally either sacrifices capacity or incurs additional latency to correct the faults. We observe that L1 cache sensitivity to hit latency offers a design trade-off between capacity and latency. When the fault rate is high at extreme Vccmin, it is beneficial to recover L1 cache capacity, even if it comes at the cost of additional latency. However, at low fault rates, the additional constant latency to recover cache capacity degrades performance. With this trade-off in mind, we propose a Non-Uniform Cache Access L1 architecture (NUCA-L1) that avoids additional latency on accesses to fault-free cache lines. To mitigate the capacity bottleneck, it deploys a correction mechanism to recover capacity at the cost of additional latency. Using extensive simulations of a 64-core multicore processor, we demonstrate that at various bit-cell fault rates, our proposed private NUCA-L1 cache architecture performs better than state-of-the-art schemes, along with a significant reduction in energy consumption.
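The latency side of this trade-off can be illustrated with a back-of-the-envelope sketch. This is not the paper's evaluation model. It assumes independent bit-cell faults, accesses spread evenly over all lines, and hypothetical values for line size, base hit latency, and correction penalty. It compares a scheme that pays the correction latency on every hit with one that pays it only on hits to faulty lines.

```python
def faulty_line_fraction(bit_fault_rate: float, line_bits: int = 512) -> float:
    """Probability that a cache line holds at least one faulty bit cell.

    Assumes bit-cell faults are independent; line_bits=512 (a 64-byte line)
    is an illustrative choice, not a value taken from the paper.
    """
    return 1.0 - (1.0 - bit_fault_rate) ** line_bits


def mean_hit_latency(
    bit_fault_rate: float,
    non_uniform: bool,
    base_cycles: float = 1.0,        # hypothetical fault-free hit latency
    correction_cycles: float = 1.0,  # hypothetical extra latency for correction
    line_bits: int = 512,
) -> float:
    """Expected hit latency in cycles, assuming hits are spread evenly over lines."""
    if non_uniform:
        # Only hits to faulty lines pay the correction penalty.
        frac = faulty_line_fraction(bit_fault_rate, line_bits)
        return base_cycles + frac * correction_cycles
    # Uniform scheme: every hit pays the constant correction penalty.
    return base_cycles + correction_cycles


if __name__ == "__main__":
    for p in (1e-4, 1e-3, 1e-2):
        uniform = mean_hit_latency(p, non_uniform=False)
        nuca = mean_hit_latency(p, non_uniform=True)
        print(f"fault rate {p:g}: uniform={uniform:.3f} cycles, non-uniform={nuca:.3f} cycles")
```

Under these assumptions the two schemes converge as the fault rate grows and nearly every line needs correction. The gap is widest at low fault rates, which matches the abstract's observation that a constant added latency hurts most when few lines are faulty.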