Journal: Microelectronics Reliability

Novel lockstep-based fault mitigation approach for SoCs with roll-back and roll-forward recovery



Abstract

All-Programmable System-on-Chips (APSoCs) are a compelling option for deploying applications in radiation environments thanks to their high computing performance and power efficiency. Despite these advantages, APSoCs are sensitive to radiation like any other electronic device. Processors embedded in APSoCs therefore have to be adequately hardened against ionizing radiation to make them a viable design choice for harsh environments. This paper proposes a novel lockstep-based approach to harden the dual-core ARM Cortex-A9 processor in the Xilinx Zynq-7000 APSoC against radiation-induced soft errors by coupling it with a MicroBlaze TMR subsystem in the programmable logic (PL) layer of the Zynq. The proposed technique combines checkpointing with roll-back and roll-forward mechanisms at the software level (i.e. software redundancy) and processor replication with checker circuits at the hardware level (i.e. hardware redundancy). Results of fault injection experiments show that the proposed approach achieves a high level of protection against soft errors, mitigating around 98% of the bit-flips injected into the register files of both ARM cores while keeping the timing performance overhead as low as 25% if block and application sizes are adjusted appropriately. Furthermore, adding the roll-forward recovery operation to the roll-back operation improves the Mean Workload Between Failures (MWBF) of the system by up to approximately 19%, depending on the nature of the running application. When a fault occurs, the application can proceed faster if it is handled with roll-forward rather than roll-back, so relatively more data can be processed before the next error occurs in the system.
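The difference between the two recovery operations can be illustrated with a toy simulation. This is only a sketch under stated assumptions and not the paper's implementation: the application state is a single integer, `execute` and `run_lockstep` are hypothetical names, each injected fault is a transient single bit-flip in one core's result, and roll-forward is assumed to rely on an independent reference result that identifies the fault-free core without costing an extra step on the two cores.

```python
from typing import Dict, List, Tuple


def execute(state: int, block: List[int]) -> int:
    """Toy workload: process one block by adding its values to the state."""
    return state + sum(block)


def run_lockstep(
    blocks: List[List[int]],
    faults: Dict[int, Tuple[int, int]],
    roll_forward: bool,
) -> Tuple[int, int]:
    """Run all blocks on two simulated cores with a checker after each block.

    faults maps a block index to (core, bit): a transient bit-flip injected
    once into that core's result (hypothetical fault model).
    Returns (final_state, steps), where steps counts block executions
    on the two lockstep cores.
    """
    checkpoint = 0          # last state both cores agreed on
    steps = 0
    pending = dict(faults)  # each fault fires only once (transient)
    i = 0
    while i < len(blocks):
        # Both cores execute the same block from the same checkpoint.
        results = [execute(checkpoint, blocks[i]) for _ in range(2)]
        steps += 1
        fault = pending.pop(i, None)
        if fault is not None:
            core, bit = fault
            results[core] ^= 1 << bit  # inject the bit-flip

        if results[0] == results[1]:
            # Checker sees agreement: commit a new checkpoint.
            checkpoint = results[0]
            i += 1
        elif roll_forward:
            # Assumption: an independent reference identifies the good core,
            # whose state is adopted so execution continues with the next block.
            reference = execute(checkpoint, blocks[i])
            checkpoint = results[0] if results[0] == reference else results[1]
            i += 1
        else:
            # Roll-back: keep the old checkpoint and re-execute the same block.
            continue
    return checkpoint, steps


blocks = [[1, 2], [3, 4], [5]]
faults = {1: (0, 3)}  # flip bit 3 of core 0's result for the second block
print(run_lockstep(blocks, faults, roll_forward=False))
print(run_lockstep(blocks, faults, roll_forward=True))
```

In this toy model both modes reach the same final state, but roll-back spends one additional block execution per detected fault, which is the intuition behind more workload being completed between failures when roll-forward is available.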
