
Microreboot - A Technique for Cheap Recovery


Abstract

A significant fraction of software failures in large-scale Internet systems are cured by rebooting, even when the exact failure causes are unknown. However, rebooting can be expensive, causing nontrivial service disruption or downtime even when clusters and failover are employed. In this work we use separation of process recovery from data recovery to enable microrebooting - a fine-grain technique for surgically recovering faulty application components, without disturbing the rest of the application. We evaluate microrebooting in an Internet auction system running on an application server. Microreboots recover most of the same failures as full reboots, but do so an order of magnitude faster and result in an order of magnitude savings in lost work. This cheap form of recovery engenders a new approach to high availability: microreboots can be employed at the slightest hint of failure, prior to node failover in multi-node clusters, even when mistakes in failure detection are likely; failure and recovery can be masked from end users through transparent call-level retries; and systems can be rejuvenated by parts, without ever being shut down.
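As a rough illustration of the idea described in the abstract, the following minimal Python sketch restarts a single faulty component while its data lives in a separate store, and masks the failure with one call-level retry. It is not the authors' implementation: the names `SessionStore`, `BidComponent`, `Container`, `microreboot`, and `call`, and the `corrupted` flag standing in for an unknown fault, are all hypothetical and chosen only for this example.

```python
from typing import Any, Callable, Dict


class SessionStore:
    """Hypothetical store kept outside components, so it survives a microreboot."""

    def __init__(self) -> None:
        self.data: Dict[str, int] = {}


class BidComponent:
    """Hypothetical component: holds only soft state, durable data is in the store."""

    def __init__(self, store: SessionStore) -> None:
        self.store = store
        self.corrupted = False  # stands in for an unknown transient fault

    def place_bid(self, item: str, amount: int) -> int:
        if self.corrupted:
            raise RuntimeError("component is in a bad state")
        # Idempotent update: repeating the same call leaves the same result.
        self.store.data[item] = max(amount, self.store.data.get(item, 0))
        return self.store.data[item]


class Container:
    """Hypothetical container that restarts one component without touching others."""

    def __init__(self, store: SessionStore) -> None:
        self.store = store
        self.factories: Dict[str, Callable[[SessionStore], Any]] = {}
        self.components: Dict[str, Any] = {}
        self.microreboots = 0

    def register(self, name: str, factory: Callable[[SessionStore], Any]) -> None:
        self.factories[name] = factory
        self.components[name] = factory(self.store)

    def microreboot(self, name: str) -> None:
        # Process recovery only: discard the instance and build a fresh one.
        # The store is untouched, so no data recovery is needed here.
        self.components[name] = self.factories[name](self.store)
        self.microreboots += 1

    def call(self, name: str, method: str, *args: Any, retries: int = 1) -> Any:
        # Call-level retry: on failure, microreboot the component and try again.
        for attempt in range(retries + 1):
            try:
                return getattr(self.components[name], method)(*args)
            except Exception:
                if attempt == retries:
                    raise
                self.microreboot(name)


store = SessionStore()
container = Container(store)
container.register("bid", BidComponent)
container.call("bid", "place_bid", "lamp", 10)

container.components["bid"].corrupted = True  # inject a fault
result = container.call("bid", "place_bid", "lamp", 15)  # caller sees no error
```

The retry in this sketch is only safe because `place_bid` is idempotent; whether and how the real system handles non-idempotent calls is not stated in the abstract.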
