Predicting and mitigating single-event upsets in DRAM using HOTH

Abstract

There is a growing demand for using commodity memory and storage solutions to make commercial aerospace ventures economically feasible. Existing radiation-hardened computer systems cannot meet this need alone. These hardened systems provide sufficient protection against the harsh environment of the upper atmosphere and low-Earth orbit, but they cost dramatically more and rely on commercially out-of-date architectures and fabrication technologies. If new aerospace systems can take advantage of the latest commodity memories, they can leverage relevant advanced fabrication processes and economies of scale to control costs. Of course, such systems would require new strategies to maintain appropriate tolerance and/or resilience to faults from the harsh environment. In this work, we observe that single-event effects (SEEs) in recent-generation DRAM memories are not entirely random, and in fact are often highly predictable under neutron radiation bombardment. We demonstrate the existence of a small number of weak cells responsible for the vast majority of single-bit SEEs. Based on this observation, we present a memory fault mapping and tolerance approach called HOTH to mitigate these predictable fault modes in conjunction with more random/unpredictable SEEs in DDR3 memory. In HOTH, both single- and multi-bit effects can be mitigated individually at runtime using a combination of existing error-correcting code techniques in Chipkill ECC and a fault map framework. The HOTH fault map is stored in the same DRAM that is subject to SEEs and leverages a fault-tolerance approach to mitigate SEEs that might appear in that part of the storage. Using data from different memory DIMMs, form factors, and radiation incidence angles, we show that with HOTH we can improve the uncorrectable fault rate by at least ten orders of magnitude and increase mean-time-to-failure to thousands of years, allowing extended service times in harsh environments.
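The weak-cell observation in the abstract can be illustrated with a minimal sketch. It is not the HOTH implementation, whose details are not given here. It assumes a log of upset addresses is available, treats any cell that upsets at least `min_repeats` times as weak, and reports how much of the log a map of those cells would cover. The function names, the `(row, column, bit)` address layout, the threshold, and the log itself are all hypothetical and invented for illustration.

```python
from collections import Counter


def build_fault_map(upset_log, min_repeats=2):
    """Return the set of cell addresses that upset at least `min_repeats` times."""
    counts = Counter(upset_log)
    return {addr for addr, n in counts.items() if n >= min_repeats}


def coverage(upset_log, fault_map):
    """Fraction of logged upsets that fall on cells listed in the fault map."""
    if not upset_log:
        return 0.0
    hits = sum(1 for addr in upset_log if addr in fault_map)
    return hits / len(upset_log)


# Synthetic log of (row, column, bit) addresses; NOT measured data.
upset_log = (
    [(12, 7, 3)] * 6      # hypothetical weak cell that upsets repeatedly
    + [(40, 1, 0)] * 3    # second hypothetical weak cell
    + [(5, 5, 5)]         # isolated, random-looking upset
)

fault_map = build_fault_map(upset_log)
print(sorted(fault_map))               # the two repeating cells
print(coverage(upset_log, fault_map))  # share of upsets the map would catch
```

In this toy log the two mapped cells account for nine of the ten upsets. The single remaining upset stands in for the random events that, per the abstract, are left to Chipkill ECC.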
Bibliographic record

  • Source
    Microelectronics Reliability | 2021, Issue 2 | 114024.1-114024.12 | 12 pages
  • Author affiliations

    Department of Computer Science, University of Pittsburgh, United States of America;

    Department of Computer Science, University of Pittsburgh, United States of America;

    Department of Electrical and Computer Engineering, University of Pittsburgh, United States of America;

    Department of Computer Science, University of Pittsburgh, United States of America;

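  • Indexed in: Science Citation Index (SCI), United States; Engineering Index (EI), United States;
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords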

    DRAM; Fault map; Memory reliability; Radiation test;
