Journal: IEEE Transactions on Computers

Energy-Efficient Permanent Fault Tolerance in Hard Real-Time Systems


Abstract

Triple Modular Redundancy (TMR) is a long-established approach for masking various kinds of faults. By employing redundancy and analyzing the results of three separate executions of the same program, TMR attains excellent levels of reliability. However, it suffers from the high power consumption of the redundant hardware, a severe obstacle to its broad adoption. The energy consumption of TMR can be mitigated if its operations are divided into two stages and one stage is dropped in the absence of faults. Such an approach, evaluated in recent research, quickly fails in the presence of permanent faults, as we show in this paper. In this work, we introduce Reactive TMR, a novel energy-efficient approach for tolerating both transient and permanent faults. The key idea is to detect and deactivate faulty components and re-assign their tasks to functioning ones. Using a combination of static scheduling and dynamic task management, our method decouples tasks from cores that are likely to produce faulty executions; hence, it inherently tolerates permanent faults and improves both reliability and energy efficiency. Through a detailed evaluation, we show that our proposal reduces the energy consumption of baseline TMR by 30 percent while preserving its reliability. Compared to the state-of-the-art proposal for TMR, our method adds hard-fault tolerance to the system while maintaining the energy consumption.
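
The two-stage idea and the reactive step described in the abstract can be illustrated with a small Python sketch. This is not the paper's algorithm: the function names, the (name, function) representation of a core, the strike_limit parameter, and the policy of demoting a suspect core to the stage-two slot are all assumptions made for illustration. Stage one runs two copies of a task; the third copy runs only when the first two disagree, and a core that repeatedly loses the vote is moved out of stage one so that later tasks need only two executions again.

```python
from collections import Counter


def majority_vote(results):
    """Return the value held by a strict majority of results, or None."""
    value, count = Counter(results).most_common(1)[0]
    return value if 2 * count > len(results) else None


def two_stage_run(task, cores):
    """Run one task on three cores given as (name, function) pairs.

    Stage 1 executes the first two cores. Stage 2 (the third core) is
    skipped when they agree. Returns (result, executions, suspects).
    """
    (n1, f1), (n2, f2), (n3, f3) = cores
    r1, r2 = f1(task), f2(task)
    if r1 == r2:
        return r1, 2, []
    r3 = f3(task)
    winner = majority_vote([r1, r2, r3])
    if winner is None:
        # All three disagree: no core can be singled out.
        return None, 3, []
    suspects = [n for n, r in ((n1, r1), (n2, r2), (n3, r3)) if r != winner]
    return winner, 3, suspects


def run_reactive(tasks, cores, strike_limit=2):
    """Hypothetical reactive policy: after strike_limit lost votes, a core
    is demoted to the last slot, so stage 1 uses the other two cores."""
    active = list(cores)
    strikes = Counter()
    outputs, executions = [], 0
    for task in tasks:
        result, used, suspects = two_stage_run(task, active)
        outputs.append(result)
        executions += used
        for name in suspects:
            strikes[name] += 1
            if strikes[name] >= strike_limit:
                idx = next(i for i, (n, _) in enumerate(active) if n == name)
                active.append(active.pop(idx))
    return outputs, executions
```

With one permanently faulty core placed in stage one, every task in a plain two-stage scheme would need all three executions. Under the assumed demotion policy, the extra execution is paid only until the core reaches strike_limit.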
