A multilevel fault model for integrated parallel fault-tolerant systems

Bernhard Fechner


Abstract

The appearance of multithreaded, multicore, and manycore systems has led to a performance leap. Such systems are called integrated when there are electrical and physical dependencies between different functional units, that is, when multiple cores are integrated on a single die. Typically, such systems have a common, shared interface to the outside world, which is a potential single point of failure. This work addresses several questions concerning fault propagation. First, if one component within a core fails, how likely is faulty behavior of other components on the same or other cores? Second, what is the overall reliability of such a system? It is important to answer these questions before implementation, because the total cost of a reliable product should be as small as possible. Our approach combines different abstraction levels in one multilevel fault model. The first level is the physical level, covering the physical effects of a fault. Validation on this level can be omitted if the modeling is precise enough. The second level is a component and routing model in which current is represented as a logic value. The last level is the behavioral modeling of components by finite state machines. Because existing parallel systems differ in number and nature, a theoretical approach is followed. The model can cover the whole range of parallel devices, from field programmable gate arrays to multicore CPUs and manycore graphics processing units. It can therefore help to improve the reliability of current and future parallel fault-tolerant systems by identifying the underlying bottlenecks. The model is demonstrated by applying it to a field programmable gate array, which identifies switchboxes as the main reliability bottleneck.
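
The abstract does not give the model's equations, so the following is only a rough illustration of the kind of question the component and routing level is meant to answer, not the author's method. It assumes a small, hypothetical routing graph (the names `io`, `sb0`, `sb1`, and `clb0` to `clb3` are invented for this sketch) and treats a fault as reaching every component downstream of the faulty one. This is a worst-case structural view, not a probability estimate.

```python
from collections import deque

# Hypothetical routing graph (an assumption, not taken from the paper):
# each key drives the components listed as its value.
ROUTING = {
    "io": ["sb0"],                   # shared interface to the outside world
    "sb0": ["clb0", "clb1", "sb1"],  # switchbox with high fan-out
    "sb1": ["clb2", "clb3"],         # second switchbox
    "clb0": [], "clb1": [], "clb2": [], "clb3": [],  # logic blocks
}

def affected_by(fault_site, routing):
    """Return every component reachable from fault_site, excluding itself."""
    seen = set()
    queue = deque([fault_site])
    while queue:
        node = queue.popleft()
        for nxt in routing.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(fault_site)
    return seen

def rank_bottlenecks(routing):
    """Sort components by how many others a fault in them could reach."""
    return sorted(routing, key=lambda n: (-len(affected_by(n, routing)), n))

if __name__ == "__main__":
    for name in rank_bottlenecks(ROUTING):
        print(name, len(affected_by(name, ROUTING)))
```

In this toy graph the shared interface and the switchboxes rank above the logic blocks simply because more components depend on them, which is the intuition behind calling them single points of failure or reliability bottlenecks. A quantitative answer would need per-component failure probabilities, which the abstract does not provide.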


Bibliographic details

Source
Concurrency and Computation: Practice and Experience | 2012, Issue 7 | pp. 687-698 | 12 pages
Author
Bernhard Fechner
Author affiliation

Department of Systems and Networking, University of Augsburg, Universitätsstr. 6a, 86159 Augsburg, Germany

Original format: PDF
Language: English

