Conference: International Symposium on Pervasive Systems, Algorithms and Networks (ISPAN 2012)

SAFER: System-level Architecture for Failure Evasion in Real-time Applications


Abstract

Recent trends towards increasing complexity in distributed embedded real-time systems pose challenges in designing and implementing a reliable system such as a self-driving car. The conventional way of improving reliability is to use redundant hardware to replicate the whole (sub)system. Although hardware replication has been widely deployed in hard real-time systems such as avionics, space shuttles and nuclear power plants, it is significantly less attractive to many applications because the amount of necessary hardware multiplies as the size of the system increases. The growing need for flexible system design is also at odds with hardware replication techniques. To address the need for dependability through redundancy operating in real time, we propose a layer called SAFER (System-level Architecture for Failure Evasion in Real-time applications) that incorporates configurable task-level fault-tolerance features to tolerate fail-stop processor and task failures in distributed embedded real-time systems. To detect such failures, SAFER monitors the health status and state information of each task and broadcasts the information. When a failure is detected using either time-based failure detection or event-based failure detection, SAFER reconfigures the system to retain the functionality of the whole system. We provide a formal analysis of the worst-case timing behaviors of SAFER features. We also describe the modeling of a system equipped with SAFER to analyze timing characteristics through a model-based design tool called SysWeaver. SAFER has been implemented on Ubuntu 10.04 LTS and deployed on Boss, an award-winning autonomous vehicle developed at Carnegie Mellon University. We show various measurements using simulation scenarios used during the 2007 DARPA Urban Challenge. Finally, we present a case study of failure recovery by SAFER when node failures are injected.
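The abstract mentions time-based failure detection followed by reconfiguration. As a rough illustration only, and not SAFER's actual implementation, the idea can be sketched as a heartbeat timeout that triggers a switch to a backup task. Every name here (`HeartbeatMonitor`, `reconfigure`, the `timeout` value, the task labels) is a hypothetical chosen for this sketch, and time is passed in explicitly so that the example is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Time-based detector: a task is suspected once its heartbeat is overdue."""
    timeout: float  # assumed detection window in seconds (hypothetical value)
    last_seen: dict = field(default_factory=dict)  # task name -> last heartbeat time

    def heartbeat(self, task: str, now: float) -> None:
        # Record the most recent time at which the task reported itself healthy.
        self.last_seen[task] = now

    def failed_tasks(self, now: float) -> list:
        # A task counts as failed when no heartbeat arrived within the window.
        return sorted(t for t, seen in self.last_seen.items()
                      if now - seen > self.timeout)


def reconfigure(primaries: dict, backups: dict, failed: list) -> dict:
    """Return a function -> task mapping with failed primaries replaced by backups."""
    result = dict(primaries)
    for function, task in primaries.items():
        if task in failed and function in backups:
            result[function] = backups[function]
    return result


monitor = HeartbeatMonitor(timeout=0.5)
monitor.heartbeat("planner@node1", now=0.0)
monitor.heartbeat("planner@node2", now=0.0)
monitor.heartbeat("planner@node2", now=0.9)  # node1 sends nothing after t=0

failed = monitor.failed_tasks(now=1.0)
active = reconfigure({"planner": "planner@node1"},
                     {"planner": "planner@node2"}, failed)
print(failed, active)
```

In this toy run the task on node1 misses its window, so the planner function is remapped to the copy on node2. A real system would also need to bound detection and switchover latency, which is what the worst-case timing analysis mentioned in the abstract addresses.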
