ReHype: Enabling VM Survival Across Hypervisor Failures

Abstract

With existing virtualized systems, hypervisor failures lead to overall system failure and the loss of all the work in progress of virtual machines (VMs) running on the system. We introduce ReHype, a mechanism for recovery from hypervisor failures by booting a new instance of the hypervisor while preserving the state of running VMs. VMs are stalled during the hypervisor reboot and resume normal execution once the new hypervisor instance is running. Hypervisor failures can lead to arbitrary state corruption and inconsistencies throughout the system. ReHype deals with the challenge of protecting the recovered hypervisor instance from such corrupted state and resolving inconsistencies between different parts of hypervisor state as well as between the hypervisor and VMs and between the hypervisor and the hardware. We have implemented ReHype for the Xen hypervisor. The implementation was done incrementally, using results from fault injection experiments to identify the sources of dangerous state corruption and inconsistencies. The implementation of ReHype involved only 880 LOC added or modified in Xen. The memory space overhead of ReHype is only 2.1MB for a pristine copy of the hypervisor code and static data plus a small reserved memory area. The fault injection campaigns used to evaluate the effectiveness of ReHype involved a system with multiple VMs running I/O and hypercall-intensive benchmarks. Our experimental results show that the ReHype prototype can successfully recover from over 90% of detected hypervisor failures.
