Fault tolerance via N-modular software redundancy

Abstract

This paper presents a novel method of "indirect" software instrumentation to achieve fault tolerance at the application level. Error detection and recovery are based on the well-known approach of replicating application processes on multiple computers in a network. The advantages of this fault tolerance scheme based on indirect instrumentation include: (1) a general error detection method that ensures data integrity for critical data without the need for any modification of the code, (2) a high degree of automation and transparency for fault-tolerant configuration and operation (i.e., the set-up time for a new application is on the order of a few minutes), and (3) the ability to perform error detection for applications for which no source code or only minimal knowledge of the code is available, including legacy applications. The types of faults that are tolerated include transient and permanent hardware faults on a single machine and certain types of application and operating system software faults.
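
The abstract does not say how the values of critical data are compared across replicas. As a rough illustration only, the sketch below assumes a simple majority vote over the values reported by N replicated processes, which is the usual reading of N-modular redundancy. The function name `vote` and its return shape are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def vote(values: Sequence[Hashable]) -> Tuple[Optional[Hashable], List[int]]:
    """Majority-vote over one critical value reported by each replica.

    Returns (agreed value, indices of replicas that disagree with it).
    If no value is reported by more than half of the replicas, the agreed
    value is None and every replica is listed as suspect.
    This is an assumed comparison rule, not the paper's mechanism.
    """
    if not values:
        raise ValueError("at least one replica value is required")
    winner, count = Counter(values).most_common(1)[0]
    if count * 2 <= len(values):
        # No strict majority: the error is detected but cannot be masked.
        return None, list(range(len(values)))
    suspects = [i for i, v in enumerate(values) if v != winner]
    return winner, suspects


if __name__ == "__main__":
    # Three replicas report the same critical variable; replica 1 is corrupted.
    agreed, faulty = vote([42, 41, 42])
    print(agreed, faulty)  # prints: 42 [1]
```

With three replicas, a single faulty machine is outvoted and identified, which matches the abstract's claim of tolerating faults on a single machine. How the suspect replica would then be recovered or restarted is outside this sketch.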
