IEEE/ACM Transactions on Networking

A unified approach to fault-tolerance in communication protocols based on recovery procedures

Abstract

This paper discusses fault tolerance in computer communication protocols modeled by communicating finite state machines and provides an efficient algorithmic procedure for recovery in such systems. Even when the communication network is reliable and preserves message order, a transient error that is not detected immediately can contaminate the system and cause protocol failure. To achieve fault tolerance, the protocol must detect the error, recover from it, eventually reach a legal (or consistent) state, and resume normal execution. A protocol that can recover and continue its execution from a legal state is also called a self-stabilizing protocol. Our recovery procedure does not require an intrusive checkpointing procedure. The stable storage requirement for each process is less than that of other proposed recovery procedures. The recovery procedure yields a legal protocol state, namely the global state before any illegal state was reached and before the effects of the error made other states illegal. Only a minimal number of processes affected by error propagation are required to roll back. The procedure can be used to recover from any number of transient errors in the system. It has also been modeled in PROMELA, a language for describing validation models, which shows the syntactic correctness of our recovery protocol design. Finally, our procedure is compared with existing approaches to handling errors, and an illustrative example is provided.
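The abstract gives no algorithmic detail, so the following is only a minimal Python sketch of the concepts it names, not the paper's procedure. It models two communicating finite state machines with FIFO channels, treats a global state as legal if it is reachable in fault-free execution, and rolls back to the global state logged just before the first illegal one. The request/acknowledge protocol, the names `TABLES`, `successors`, `legal_states`, and `recover`, and the full global history log are all assumptions made for illustration. In particular, the global log is a naive stand-in: the paper states that its procedure avoids intrusive checkpointing and rolls back only the affected processes.

```python
from collections import deque

# Hypothetical two-process protocol (an assumption, not from the paper).
# Transition tables: (state, action, message) -> next state.
# "!" sends the message to the peer; "?" receives it from the peer.
TABLES = {
    "S": {("idle", "!", "req"): "wait", ("wait", "?", "ack"): "idle"},
    "R": {("ready", "?", "req"): "busy", ("busy", "!", "ack"): "ready"},
}
PEER = {"S": "R", "R": "S"}
IDX = {"S": 0, "R": 1}

# A global state is (state of S, state of R, inbox of S, inbox of R).
INITIAL = ("idle", "ready", (), ())


def successors(g):
    """Yield every global state reachable from g in one fault-free transition."""
    for p, table in TABLES.items():
        i, j = IDX[p], IDX[PEER[p]]
        for (src, action, msg), dst in table.items():
            if g[i] != src:
                continue
            nxt = list(g)
            if action == "!":
                nxt[2 + j] = g[2 + j] + (msg,)   # append to the peer's inbox
            elif g[2 + i][:1] == (msg,):
                nxt[2 + i] = g[2 + i][1:]        # consume head of own inbox
            else:
                continue                         # receive not enabled
            nxt[i] = dst
            yield tuple(nxt)


def legal_states(initial):
    """Breadth-first search over fault-free executions (finite for this protocol)."""
    seen, frontier = {initial}, deque([initial])
    while frontier:
        for nxt in successors(frontier.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def recover(history, legal):
    """Return the global state logged just before the first illegal one.

    Assumes history[0] is the (legal) initial state.
    """
    for k in range(1, len(history)):
        if history[k] not in legal:
            return history[k - 1]
    return history[-1]  # no illegal state was logged


LEGAL = legal_states(INITIAL)

history = [
    INITIAL,
    ("wait", "ready", (), ("req",)),        # S sent req
    ("idle", "ready", (), ("req",)),        # transient error: S flips to idle
    ("wait", "ready", (), ("req", "req")),  # error propagates: S sends again
]
restored = recover(history, LEGAL)
```

In this toy run the error is not noticed when it occurs; it surfaces only after a second `req` has been sent, and `recover` returns the state in which S is waiting with a single `req` in transit. A realistic procedure would have to find such a state without a global log, which is the harder problem the paper addresses.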
