
RaaS: Resilience as a Service

Abstract

Cloud computing continues to grow in popularity as key features such as scalability, pay-per-use, and availability evolve. It is also becoming a competitive platform for running high performance computing (HPC) and parallel applications, owing to the increasing performance of virtualized, highly available instances. However, migrating HPC applications to the cloud still requires native fault-tolerant solutions to fully leverage cloud features and maximize resource utilization at the best cost, particularly for long-running parallel applications, where faults can cause invalid states or data loss. Such faults force applications to be re-executed, which increases completion time and cost. We propose Resilience as a Service (RaaS), a fault-tolerant framework for HPC applications running in the cloud. In this paper, the RADIC architecture (Redundant Array of Distributed Independent Fault Tolerance Controllers) is used to provide clouds with a highly available, distributed, and scalable fault-tolerant service. The paper explores how traditional HPC protection and recovery mechanisms must be redesigned to natively leverage cloud properties and the multiple alternatives they offer for implementing rollback recovery protocols using virtual machines, containers, object and block storage, or database services. Results show that RaaS restores and completes the application execution using available resources while reducing overhead up to 8% for different fault-tolerant configuration alternatives.
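
To make the idea of rollback recovery concrete, the following is a minimal single-process sketch of checkpoint-based recovery. It is not the RaaS or RADIC implementation, which the abstract does not detail. Every name here (`InMemoryCheckpointStore`, `run_task`, `run_with_recovery`, `SimulatedFault`) is hypothetical, and the in-memory dictionary merely stands in for whichever storage service (object storage, block storage, or a database) a real deployment might use.

```python
import pickle


class InMemoryCheckpointStore:
    """Hypothetical stand-in for an object store, block volume, or database."""

    def __init__(self):
        self._blobs = {}

    def save(self, key, state):
        # Serialize so the stored copy is independent of the live state.
        self._blobs[key] = pickle.dumps(state)

    def load(self, key):
        blob = self._blobs.get(key)
        return None if blob is None else pickle.loads(blob)


class SimulatedFault(Exception):
    """Raised to imitate the loss of the node running the task."""


def run_task(store, key, total_steps, checkpoint_every, fail_at=None):
    """Sum 0..total_steps-1, resuming from the last checkpoint if one exists."""
    state = store.load(key) or {"step": 0, "acc": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise SimulatedFault(f"node lost at step {fail_at}")
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            store.save(key, state)
    return state["acc"]


def run_with_recovery(total_steps=10, checkpoint_every=3, fail_at=7):
    store = InMemoryCheckpointStore()
    try:
        return run_task(store, "job-1", total_steps, checkpoint_every, fail_at)
    except SimulatedFault:
        # Roll back: restart from the last saved checkpoint instead of step 0.
        return run_task(store, "job-1", total_steps, checkpoint_every)
```

In this sketch, the work done after the last checkpoint is repeated on recovery, while everything before it is preserved. The checkpoint interval therefore trades checkpointing overhead against the amount of re-execution after a fault, which is the kind of cost the abstract refers to.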
