Journal: Operating Systems Review

Rx: Treating Bugs As Allergies— A Safe Method to Survive Software Failures

Abstract

Many applications demand availability. Unfortunately, software failures greatly reduce system availability. Prior work on surviving software failures suffers from one or more of the following limitations: required application restructuring, inability to address deterministic software bugs, unsafe speculation on program execution, and long recovery time. This paper proposes an innovative safe technique, called Rx, which can quickly recover programs from many types of software bugs, both deterministic and non-deterministic. Our idea, inspired by allergy treatment in real life, is to roll back the program to a recent checkpoint upon a software failure, and then to re-execute the program in a modified environment. We base this idea on the observation that many bugs are correlated with the execution environment, and therefore can be avoided by removing the "allergen" from the environment. Rx requires few to no modifications to applications and provides programmers with additional feedback for bug diagnosis. We have implemented Rx on Linux. Our experiments with four server applications that contain six bugs of various types show that Rx can survive all six software failures and provide transparent fast recovery within 0.017-0.16 seconds, 21-53 times faster than the whole-program restart approach for all but one case (CVS). In contrast, the two tested alternatives, a whole-program restart approach and a simple rollback and re-execution without environmental changes, cannot successfully recover the three servers (Squid, Apache, and CVS) that contain deterministic bugs, and have only a 40% recovery rate for the server (MySQL) that contains a non-deterministic concurrency bug. Additionally, Rx's checkpointing system is lightweight, imposing small time and space overheads.
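
The rollback-and-re-execute idea in the abstract can be illustrated with a toy Python sketch. This is not Rx's implementation: the abstract does not say how Rx takes checkpoints on Linux or which environmental changes it applies. Everything here is an assumption made for illustration, including the names `handle_request`, `run_with_recovery`, `ENV_CHANGES` and `SimulatedCrash`, the use of buffer padding as the environmental change, and the use of `copy.deepcopy` as a stand-in for a checkpoint.

```python
import copy


class SimulatedCrash(Exception):
    """Stands in for a software failure detected at run time (hypothetical)."""


def handle_request(state, request, env):
    """Toy server step with a deterministic bug: a fixed 8-byte buffer.

    The bug is triggered by the input, so plain re-execution would fail
    again; only a changed environment (extra padding) avoids it.
    """
    state["in_flight"] += 1  # mutation that a crash leaves half-done
    buf_size = 8 + env.get("pad_bytes", 0)
    if len(request) > buf_size:
        raise SimulatedCrash(f"overflow: {len(request)} > {buf_size}")
    state["served"].append(request)
    state["in_flight"] -= 1


# Assumed environmental changes, tried in order after a failure.
ENV_CHANGES = [{"pad_bytes": 16}, {"pad_bytes": 64}]


def restore(state, checkpoint):
    """Roll the live state back to the checkpointed copy, in place."""
    state.clear()
    state.update(copy.deepcopy(checkpoint))


def run_with_recovery(state, requests, env_changes=ENV_CHANGES):
    """Process requests; on failure, roll back and retry in a modified env.

    Returns a list of (request, env) pairs, one per successful recovery,
    which is the kind of feedback a programmer could use for diagnosis.
    """
    recoveries = []
    for request in requests:
        checkpoint = copy.deepcopy(state)  # checkpoint before each request
        try:
            handle_request(state, request, env={})  # normal execution
            continue
        except SimulatedCrash:
            pass
        for env in env_changes:
            restore(state, checkpoint)  # roll back to the checkpoint
            try:
                handle_request(state, request, env)  # re-execute, new env
                recoveries.append((request, env))
                break
            except SimulatedCrash:
                continue
        else:
            restore(state, checkpoint)
            raise RuntimeError(f"could not recover from request {request!r}")
    return recoveries


if __name__ == "__main__":
    server_state = {"served": [], "in_flight": 0}
    print(run_with_recovery(server_state, ["short", "a-much-longer-request"]))
    print(server_state)
```

In this sketch the second request always overflows the unpadded buffer, so a restart or a plain re-execution would hit the same failure, while re-execution with padding succeeds. The rollback also undoes the half-finished `in_flight` update, so the state after recovery matches what a failure-free run would have produced.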
