Journal: Device and Materials Reliability, IEEE Transactions on

Assessment of the Impact of Cosmic-Ray-Induced Neutrons on Hardware in the Roadrunner Supercomputer


Abstract

Microprocessor-based systems are a common design for high-performance computing (HPC) platforms. In these systems, several thousand microprocessors can participate in a single calculation that may take weeks or months to complete. When used in this manner, a fault in any one of the microprocessors could cause the computation to crash or cause silent data corruption (SDC), i.e., computationally incorrect results that originate from an undetected fault. In recent years, neutron-induced effects in HPC hardware have been observed, and researchers have started to study how neutrons impact microprocessor-based computations. This paper presents results from an accelerated neutron-beam test focusing on two microprocessors used in Roadrunner, the first petaflop supercomputer. Research questions of interest include whether the running application affects neutron susceptibility and whether different replicates of the hardware under test have different susceptibilities to neutrons. Estimated failures in time (FIT) for crashes and for SDC are presented for the hardware under test, for the Triblade servers used for computation in Roadrunner, and for Roadrunner as a whole.
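
The abstract does not say how the failures-in-time estimates were derived. As a rough sketch of the arithmetic commonly used with accelerated beam data (not necessarily the authors' method), an observed event count is divided by the beam fluence to give a cross section, which is then scaled by an ambient neutron flux and by 10^9 hours. The function names, the flux value, and all counts below are hypothetical placeholders, and multiplying by a device count assumes independent, identical devices.

```python
HOURS_PER_FIT = 1e9  # one FIT is one failure per 10^9 device-hours


def cross_section(events: int, fluence_n_per_cm2: float) -> float:
    """Events observed per unit beam fluence, in cm^2."""
    if fluence_n_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return events / fluence_n_per_cm2


def fit_rate(sigma_cm2: float, flux_n_per_cm2_hour: float) -> float:
    """Scale a beam-derived cross section to failures per 10^9 hours."""
    return sigma_cm2 * flux_n_per_cm2_hour * HOURS_PER_FIT


def system_fit(device_fit: float, device_count: int) -> float:
    """Assumption: devices fail independently, so their rates add."""
    return device_fit * device_count


if __name__ == "__main__":
    # Hypothetical inputs, not values from the paper.
    sigma = cross_section(events=10, fluence_n_per_cm2=1e10)
    per_device = fit_rate(sigma, flux_n_per_cm2_hour=13.0)  # placeholder flux
    print(per_device, system_fit(per_device, device_count=1000))
```

The ambient flux depends strongly on altitude and location, so a real estimate would use a site-specific value rather than the placeholder shown here.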