Journal: Microelectronics & Reliability

Correctable and uncorrectable errors using large scale DRAM DIMMs in replacement network servers



Abstract

This paper investigates DRAM DIMM errors using field records from replacement network servers. A large sample of about 40 K DRAM DIMMs was collected over a 2.5-year period from 23 different server types. The sample included DIMMs from three different DRAM manufacturers, with densities between 4 and 128 GB and speeds between 1066 and 2400 Mbps. Errors that occurred during system operation were classified as either correctable errors (CEs) or uncorrectable errors (UEs), based on the error correction code (ECC) schemes built into the servers. Of the collected DIMMs, 24% had recorded errors, with CE-only, UE-only, and combined UE and CE cases comprising 28%, 43%, and 29% of recorded errors, respectively. Since UEs can cause large-scale failures, systems are replaced upon any UE occurrence. Approximately half of the UE-only DIMMs had a single UE. In contrast, many DIMMs had billions of CEs, which can arise when a faulty location is accessed repeatedly. Such drastic differences between UE and CE counts help explain the importance of ECC and error mitigation schemes. Errors were compared across manufacturers and operating speeds. After reasonable adjustments for repeated error counts, failure in time (FIT) differences across manufacturers were up to 38%. Higher-speed DIMMs generally had higher FIT, with 2400 Mbps DIMMs exhibiting 6.7 times the FIT of 1066 Mbps DIMMs.
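For readers unfamiliar with the FIT metric used above, the following is a minimal sketch of the conventional definition: failures per 10^9 device-hours. It is not the authors' procedure; the abstract does not describe how repeated error counts were adjusted, so that step is omitted. The function name `fit_rate` and all input numbers are hypothetical, chosen only so that the ratio mirrors the 6.7 times figure quoted in the abstract.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def fit_rate(failures: int, devices: int, hours_per_device: float) -> float:
    """Return failures per 10^9 device-hours (the conventional FIT definition)."""
    device_hours = devices * hours_per_device
    if device_hours <= 0:
        raise ValueError("total device-hours must be positive")
    return failures / device_hours * 1e9


# Hypothetical populations, NOT data from the paper: two groups of
# 1000 DIMMs each, observed for one year.
fit_slow = fit_rate(failures=10, devices=1000, hours_per_device=HOURS_PER_YEAR)
fit_fast = fit_rate(failures=67, devices=1000, hours_per_device=HOURS_PER_YEAR)

# Ratio of the two FIT values; with equal device-hours it reduces to
# the ratio of failure counts.
fit_ratio = fit_fast / fit_slow
print(f"FIT ratio (fast / slow): {fit_ratio:.1f}")
```

Because FIT normalizes by device-hours, groups with different population sizes or observation windows can be compared directly, which is what makes a cross-manufacturer or cross-speed comparison meaningful.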
