Journal: IEEE Transactions on Computers

Energy-Efficient Permanent Fault Tolerance in Hard Real-Time Systems

Abstract

Triple Modular Redundancy (TMR) is a long-established approach for masking various kinds of faults. By executing the same program three times and comparing the results, TMR attains excellent levels of reliability. Its drawback is the high power consumption of the redundant hardware, which severely limits its broad adoption. The energy consumption of TMR can be mitigated if its operation is divided into two stages and one stage is dropped in the absence of faults. Such an approach, which has been evaluated in recent research, nevertheless fails quickly in the presence of permanent faults, as we show in this paper. In this work, we introduce Reactive TMR, a novel energy-efficient approach for tolerating both transient and permanent faults. The key idea is to detect and deactivate faulty components and reassign their tasks to functioning ones. Using a combination of static scheduling and dynamic task management, our method decouples tasks from cores that are likely to produce faulty executions; hence, it inherently tolerates permanent faults and improves both reliability and energy efficiency. Through a detailed evaluation, we show that our proposal reduces the energy consumption of baseline TMR by 30 percent while preserving its reliability. Compared with the state-of-the-art proposal for TMR, our method maintains the same energy consumption while adding hard-fault tolerance to the system.
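
The two-stage idea mentioned in the abstract can be illustrated with a small sketch: run two replicas first, and run the third only when they disagree. This is a minimal illustration under stated assumptions, not the paper's Reactive TMR algorithm or scheduler. The names `majority` and `two_stage_tmr` are hypothetical, replicas are modeled as plain Python callables, and results are assumed to be hashable values that can be compared for equality.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence, Tuple


def majority(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value at least two results agree on, or None if no majority."""
    value, count = Counter(results).most_common(1)[0]
    return value if count >= 2 else None


def two_stage_tmr(
    replicas: Sequence[Callable[[int], Hashable]], x: int
) -> Tuple[Optional[Hashable], int, Optional[int]]:
    """Run a two-stage TMR vote over three replicas (hypothetical helper).

    Returns (voted_result, number_of_executions, suspect_replica_index).
    """
    # Stage 1: execute only two replicas and compare their results.
    first = replicas[0](x)
    second = replicas[1](x)
    if first == second:
        # No disagreement: the third execution is skipped to save energy.
        return first, 2, None

    # Stage 2: a mismatch was observed, so the third replica breaks the tie.
    third = replicas[2](x)
    results = [first, second, third]
    voted = majority(results)

    # The replica that disagrees with the majority is the suspect.
    suspect = None
    if voted is not None:
        suspect = next(i for i, r in enumerate(results) if r != voted)
    return voted, 3, suspect


if __name__ == "__main__":
    healthy = lambda x: x * x
    faulty = lambda x: x * x + 1  # models a replica that always errs

    print(two_stage_tmr([healthy, healthy, healthy], 3))  # (9, 2, None)
    print(two_stage_tmr([healthy, faulty, healthy], 3))   # (9, 3, 1)
```

In this toy model, a replica that errs on every call forces the third execution every time, so the energy saving of the first stage disappears. That is one way to see why a scheme that identifies the suspect and moves its work elsewhere, as the abstract describes, could help; the actual detection and reassignment policy is defined in the paper, not here.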
