
Improving 3D DRAM Fault Tolerance Through Weak Cell Aware Error Correction

Abstract

Although emerging 3D DRAM products can significantly improve computing system performance, their relatively high cost is one of the most critical obstacles to wide real-life adoption. Intuitively, strong memory fault tolerance can be leveraged to reduce the fabrication cost of DRAM dies, and the total cost falls if the fabrication cost savings offset the cost overhead of memory fault tolerance. Nevertheless, this simple concept is practically viable only for 3D DRAM because: (1) The stacked logic die can implement memory fault tolerance entirely inside the 3D DRAM chip, obviating any changes to host CPUs and CPU-DRAM interfaces. (2) With total ownership of both the logic die and the DRAM dies inside 3D DRAM chips, DRAM manufacturers can fully exploit this potential to minimize the 3D DRAM bit cost. Following this intuition, we developed a 3D DRAM fault tolerance design strategy that achieves very strong tolerance to weak DRAM cells at very small redundancy and latency overhead. The key is to cohesively leverage the detectability of weak cells and the runtime configurability of error correction code (ECC) decoding. In addition, this design strategy gracefully accommodates inaccuracy in weak cell detection (e.g., missed detection and false detection of weak cells). We carried out a thorough mathematical analysis, and the results show that, under a redundancy overhead of 1:8 (the same as today’s ECC DIMMs), this design strategy can tolerate a weak cell rate as high as 10⁻⁴ and 6 × 10⁻⁵ if 100 and 90 percent, respectively, of all weak cells are known a priori. Using Micron’s Hybrid Memory Cube (HMC) 3D DRAM chips as the test vehicle, we evaluated the implementation cost, and the results show that the design consumes less than 0.4 mm² (45 nm node) on the logic die. Using CPU and DRAM simulators, we further carried out simulations over a variety of computing benchmarks, and the results show that this design incurs less than 2 percent performance degradation on average.
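
As a rough illustration of why the weak cell rate matters at a 1:8 redundancy overhead, the sketch below computes the probability that a single codeword contains more weak cells than a code can tolerate, assuming each cell is weak independently. The 72-bit word (64 data bits plus 8 check bits), the function name `prob_more_than`, and the tolerance values are illustrative assumptions, not parameters or methods taken from the paper.

```python
from math import comb


def prob_more_than(n_bits: int, p_weak: float, k: int) -> float:
    """Probability that an n_bits word holds more than k weak cells,
    assuming each cell is weak independently with probability p_weak."""
    at_most_k = sum(
        comb(n_bits, i) * p_weak**i * (1 - p_weak) ** (n_bits - i)
        for i in range(k + 1)
    )
    return 1.0 - at_most_k


# Hypothetical 72-bit word: 64 data bits + 8 check bits (1:8 redundancy).
WORD_BITS = 72

# Weak cell rates quoted in the abstract; the tolerance k is an assumption.
for rate in (1e-4, 6e-5):
    for k in (1, 2):
        p_fail = prob_more_than(WORD_BITS, rate, k)
        print(f"rate={rate:g} tolerate={k} -> P(word exceeds k)={p_fail:.3e}")
```

This simple binomial model ignores weak cell detection inaccuracy and runtime ECC reconfiguration, both of which the abstract identifies as central to the actual design, so it should be read only as a back-of-the-envelope estimate.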

Bibliographic record

  • Source
    IEEE Transactions on Computers | 2017, Issue 5 | pp. 820-833 | 14 pages
  • Author affiliations

    Department of Electrical, Computer and System Engineering, Rensselaer Polytechnic Institute, Troy, NY;

    Department of Electrical, Computer and System Engineering, Rensselaer Polytechnic Institute, Troy, NY;

    Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an, Shaanxi, China;

    Department of Electrical, Computer and System Engineering, Rensselaer Polytechnic Institute, Troy, NY;

    Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an, Shaanxi, China;

    Department of Electrical, Computer and System Engineering, Rensselaer Polytechnic Institute, Troy, NY;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Error correction codes; Three-dimensional displays; Redundancy; Fault tolerant systems; DRAM chips;

