Journal: IEEE Transactions on Computers

Teraflops supercomputer: architecture and validation of the fault tolerance mechanisms

Abstract

Intel Corporation developed the Teraflops supercomputer for the US Department of Energy (DOE) as part of the Accelerated Strategic Computing Initiative (ASCI). This is the most powerful computing machine available today, performing over two trillion floating point operations per second with the aid of more than 9,000 Intel processors. The Teraflops machine employs complex hardware and software fault/error handling mechanisms for complying with DOE's reliability requirements. This paper gives a brief description of the system architecture and presents the validation of the fault tolerance mechanisms. Physical fault injection at the IC pin level was used for validation purposes. An original approach was developed for assessing signal sensitivity to transient faults and the effectiveness of the fault/error handling mechanisms. Dependency between fault/error detection coverage and fault duration was also determined. Fault injection experiments unveiled several malfunctions at the hardware, firmware, and software levels. The supercomputer performed according to the DOE requirements after corrective actions were implemented. The fault injection approach presented in this paper can be used for validation of any fault-tolerant or highly available computing system.
