Published in: Security and Communication Networks

CDMCR: multi-level fault-tolerant system for distributed applications in cloud

Abstract

The cloud gives users a new model for using computing infrastructure, with the ability to run parallel and distributed computations on elastic virtual clusters. However, its multi-level, complex structure makes a cloud computing system more prone to failure. In this paper, we present a multi-level fault-tolerant system for distributed applications in the cloud, named Distributed-application oriented Multi-level Checkpoint/Restart for Cloud (CDMCR). CDMCR periodically backs up the complete state of an application, including file system state, with a snapshot-based distributed checkpointing protocol. Thus, we can not only recover processes but also roll back data. We propose a multi-level recovery strategy comprising process-level recovery, virtual machine recreation, and host rescheduling, which enables comprehensive and efficient fault tolerance for different components in the cloud. We deploy CDMCR as PaaS, so that users are freed from node management and system configuration and can conveniently access the fault-tolerance service. We have implemented the system on the Xen virtualization platform and the OpenNebula cloud platform. Experiments on the prototype demonstrate the correctness of the system. Analysis shows that CDMCR causes neither message loss nor data loss, and that the backup time remains nearly constant as the number of nodes in the virtual cluster increases. Copyright (c) 2015 John Wiley & Sons, Ltd.
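
The three recovery levels named in the abstract can be pictured as an escalation ladder. The sketch below is only an illustration under stated assumptions: the abstract does not say that the levels are tried in order or how each is triggered, and the names `RecoveryLevel`, `recover`, and the handler callables are hypothetical, not part of CDMCR.

```python
from enum import IntEnum
from typing import Callable, Dict, Optional


class RecoveryLevel(IntEnum):
    """Recovery levels listed in the abstract (ordering is an assumption)."""
    PROCESS = 1  # restart failed processes from the last checkpoint
    VM = 2       # recreate the virtual machine, then restore state
    HOST = 3     # reschedule the virtual machine onto another host


def recover(
    handlers: Dict[RecoveryLevel, Callable[[], bool]],
    start: RecoveryLevel = RecoveryLevel.PROCESS,
) -> Optional[RecoveryLevel]:
    """Try each level from `start` upward; return the first that succeeds.

    Each handler is a hypothetical callable returning True on success.
    Returns None if every attempted level fails.
    """
    for level in sorted(handlers):
        if level < start:
            continue  # skip levels cheaper than the detected failure
        if handlers[level]():
            return level
    return None


if __name__ == "__main__":
    # Simulated failure: process restart fails, VM recreation works.
    outcome = recover({
        RecoveryLevel.PROCESS: lambda: False,
        RecoveryLevel.VM: lambda: True,
        RecoveryLevel.HOST: lambda: True,
    })
    print(outcome.name if outcome else "unrecovered")
```

Starting at the cheapest level and escalating only on failure is one plausible reading of "efficient fault tolerance for different components"; a real system might instead pick the level directly from the type of failure detected.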