Conference: IEEE International Symposium on High Performance Computer Architecture

DUO: Exposing On-Chip Redundancy to Rank-Level ECC for High Reliability

Abstract

DRAM row and column sparing cannot efficiently tolerate the increasing inherent fault rate caused by continued process scaling. In-DRAM ECC (IECC), an appealing alternative to sparing, can resolve inherent faults without significant changes to DRAM, but it is inefficient for highly reliable systems where rank-level ECC (RECC) is already used against operational faults. In addition, DRAM designs in the near future (possibly as early as DDR5) may transfer data in longer bursts, which complicates high-reliability RECC because fewer devices are used per rank and fault granularity increases. We propose dual use of on-chip redundancy (DUO), a mechanism that bypasses the IECC module and transfers on-chip redundancy to be used directly for RECC. Due to its increased redundancy budget, DUO enables a strong and novel RECC for highly reliable systems, called DUO SDDC. The long codewords of DUO SDDC provide fundamentally higher detection and correction capabilities, and several novel secondary-correction techniques work together to further expand its correction capability. According to our evaluation results, DUO shows performance degradation on par with or better than IECC (average 2-3%), while consuming less DRAM energy than IECC (average 4-14% overheads). DUO provides higher reliability than either IECC or the state-of-the-art ECC technique. We show the robust reliability of DUO SDDC by comparing it to other ECC schemes using two different inherent fault-error models.