IEEE Transactions on Computers

Dependability analysis of a high-speed network using software-implemented fault injection and simulated fault injection

Abstract

This paper presents a dependability study of high-speed, switched Local Area Networks (LANs) using Myrinet as an example testbed (with theoretical speeds of 2.56 Gbps). The study uses the results of two fault injection methods, simulated fault injection and software-implemented fault injection (SWIFI), to analyze the application-level impact of transient faults injected into the network interface hardware. The observed errors include dropped or corrupt messages, host interface or host resets, and local or remote host interface hangs. The paper presents the study in two parts. First, the results from the SWIFI method in the real system are used as a basis to validate the simulation and to identify the major factors leading to differences between the methods. A comparison between the two injection methods shows that they agree for 83 percent of the fault injections. The level of agreement, however, varies greatly depending on the fault type considered. The study also analyzes the effects of varying the workload intensity, the host platform, and the interface function targeted by the injection. For example, this analysis shows that the function targeted has a significant impact on the fault activation rate. Finally, the study identifies two mechanisms by which faults may propagate from the interface to other parts of the network: in one case, this propagation caused the interface's host computer to reboot, while in another it caused a remote interface in the network to hang.
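The two ideas the abstract relies on, a transient fault modeled as a bit flip and an agreement rate between two injection methods, can be illustrated with a minimal sketch. This is not the paper's tooling: the single-bit-flip fault model, the function names `flip_bit`, `agreement_rate`, and `disagreements`, and the outcome labels are all assumptions made for illustration, and the sample outcomes are invented rather than taken from the study.

```python
from collections import Counter


def flip_bit(word: int, bit: int, width: int = 32) -> int:
    """Model a transient fault by inverting one bit of a fixed-width word.

    Assumption: a single-bit flip stands in for the transient faults the
    abstract mentions; the paper's actual fault models are not given here.
    """
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    return (word ^ (1 << bit)) & ((1 << width) - 1)


def agreement_rate(swifi: list[str], simulated: list[str]) -> float:
    """Fraction of paired injections where both methods saw the same outcome."""
    if len(swifi) != len(simulated):
        raise ValueError("outcome lists must be paired one-to-one")
    if not swifi:
        raise ValueError("at least one injection is required")
    matches = sum(a == b for a, b in zip(swifi, simulated))
    return matches / len(swifi)


def disagreements(swifi: list[str], simulated: list[str]) -> Counter:
    """Count (swifi_outcome, simulated_outcome) pairs that differ."""
    return Counter((a, b) for a, b in zip(swifi, simulated) if a != b)


if __name__ == "__main__":
    # Hypothetical outcome labels for four paired injections (not real data).
    swifi_outcomes = ["dropped", "no_effect", "hang", "corrupt"]
    sim_outcomes = ["dropped", "no_effect", "reset", "corrupt"]
    print(agreement_rate(swifi_outcomes, sim_outcomes))
    print(disagreements(swifi_outcomes, sim_outcomes))
```

Grouping the paired outcomes by fault type before calling `agreement_rate` would give the kind of per-fault-type breakdown the abstract refers to.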