Fault-Tolerant Scheduling for Real-Time Scientific Workflows with Elastic Resource Provisioning in Virtualized Clouds

Xiaomin Zhu; Ji Wang; Hui Guo; Dakai Zhu; Laurence T. Yang; Ling Liu

Abstract

Clouds are becoming an important platform for scientific workflow applications. However, with many nodes deployed in clouds, managing the reliability of resources becomes a critical issue, especially for real-time scientific workflow execution, where deadlines should be satisfied. Fault tolerance in clouds is therefore essential. Primary-backup (PB) scheduling is a popular fault-tolerance technique and has been used effectively in cluster and grid computing. However, applying this technique to real-time workflows in a virtualized cloud is much more complicated and has rarely been studied. In this paper, we address this problem. We first establish a real-time workflow fault-tolerant model that extends the traditional PB model by incorporating cloud characteristics. Based on this model, we develop approaches for task allocation and message transmission to ensure that faults can be tolerated during workflow execution. Finally, we propose a dynamic fault-tolerant scheduling algorithm, FASTER, for real-time workflows in the virtualized cloud. FASTER has three key features: 1) it employs a backward shifting method to make full use of idle resources and incorporates task overlapping and VM migration for high resource utilization, 2) it applies the vertical/horizontal scaling-up technique to quickly provision resources for a burst of workflows, and 3) it uses the vertical scaling-down scheme to avoid unnecessary and ineffective resource changes caused by fluctuating workflow requests. We evaluate FASTER with synthetic workflows and with workflows collected from real scientific and business applications, and compare it with six baseline algorithms. The experimental results demonstrate that FASTER can effectively improve resource utilization and schedulability even in the presence of node failures in virtualized clouds.
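The primary-backup idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the FASTER algorithm and not the paper's model: it assumes independent tasks (no workflow dependencies), identical hosts, a passive backup that starts only after the primary's planned finish, and at most one host failure. The names `Task`, `Host`, and `schedule_pb` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Task:
    name: str
    exec_time: float  # execution time, assumed identical on every host
    deadline: float   # absolute deadline


@dataclass
class Host:
    name: str
    free_at: float = 0.0  # earliest time at which the host is idle


Slot = Tuple[str, float, float]  # (host name, start time, finish time)


def schedule_pb(task: Task, hosts: List[Host]) -> Optional[Dict[str, Slot]]:
    """Place a primary and a passive backup copy on two different hosts.

    Returns the two reserved slots, or None if either copy would miss the
    deadline. Hosts are modified only when the task is accepted.
    """
    if len(hosts) < 2:
        return None  # a backup needs a host distinct from the primary's

    # Primary copy: earliest-available host.
    primary_host = min(hosts, key=lambda h: h.free_at)
    p_start = primary_host.free_at
    p_finish = p_start + task.exec_time
    if p_finish > task.deadline:
        return None

    # Backup copy: a different host, starting no earlier than the primary's
    # planned finish, so it only needs to run if the primary's host fails.
    others = [h for h in hosts if h is not primary_host]
    backup_host = min(others, key=lambda h: max(h.free_at, p_finish))
    b_start = max(backup_host.free_at, p_finish)
    b_finish = b_start + task.exec_time
    if b_finish > task.deadline:
        return None

    # Commit both reservations. Simplification: the backup slot stays
    # reserved even though it would be unused when the primary succeeds.
    primary_host.free_at = p_finish
    backup_host.free_at = b_finish
    return {
        "primary": (primary_host.name, p_start, p_finish),
        "backup": (backup_host.name, b_start, b_finish),
    }
```

In this sketch a task is accepted only if the backup copy can also finish before the deadline, which is what lets a single host failure be tolerated without a deadline miss. Reclaiming or overlapping the reserved backup slots, which the abstract lists among FASTER's features, is deliberately left out.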

Bibliographic information

Source: IEEE Transactions on Parallel and Distributed Systems, 2016, no. 12, pp. 3501-3517 (17 pages)
Authors: Xiaomin Zhu; Ji Wang; Hui Guo; Dakai Zhu; Laurence T. Yang; Ling Liu
Author affiliations:

Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan, P. R. China;

Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan, P. R. China;

School of Computer Science and Engineering, University of New South Wales, NSW, Australia;

Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX;

Department of Computer Science, St. Francis Xavier University, Antigonish, NS, Canada;

College of Computing, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA;

Original format: PDF
Language: English
Keywords:
Fault tolerant systems; Cloud computing; Real-time systems; Processor scheduling; Dynamic scheduling; Resource management;

Similar literature

1. Elastic resource provisioning for scientific workflow scheduling in cloud under budget and deadline constraints [J]. Shi Jiyuan, Luo Junzhou, Dong Fang. Cluster Computing, 2016, no. 1.
2. FESTAL: Fault-Tolerant Elastic Scheduling Algorithm for Real-Time Tasks in Virtualized Clouds [J]. Wang Ji, Bao Weidong, Zhu Xiaomin. IEEE Transactions on Computers, 2015, no. 9.
3. Scheduling deadline constrained scientific workflows on dynamically provisioned cloud resources [J]. Vahid Arabnejad, Kris Bubendorfer, Bryan Ng. Future Generation Computer Systems, 2017, no. octa.
4. A Responsive Knapsack-Based Algorithm for Resource Provisioning and Scheduling of Scientific Workflows in Clouds [C]. Rodriguez Maria A., Buyya Rajkumar. International Conference on Parallel Processing, 2015.
5. Energy-aware workflow scheduling and fuzzy logic based capacity provisioning in cloud environment [D]. Pan, Yao. 2015.
6. Cancer Diagnosis Epigenomics Scientific Workflow Scheduling in the Cloud Computing Environment Using an Improved PSO Algorithm [O]. Sadhasivam N, Balamurugan R, Pandi M. 2018.
7. Deadline Based Resource Provisioning and Scheduling Algorithm for Scientific Workflows on Clouds [O]. Maria Alejandra Rodriguez, Rajkumar Buyya. 2015.
