Journal of Grid Computing

A Provenance-based Adaptive Scheduling Heuristic for Parallel Scientific Workflows in Clouds


Abstract

In recent years, scientific workflows have emerged as a fundamental abstraction for structuring and executing scientific experiments in computational environments. Scientific workflows are becoming increasingly complex and more demanding in terms of computational resources, and thus require parallel techniques and high performance computing (HPC) environments. Meanwhile, clouds have emerged as a new paradigm in which resources are virtualized and provided on demand. By using clouds, scientists have expanded beyond single parallel computers to hundreds or even thousands of virtual machines. Although the initial focus of clouds was to provide high throughput computing, clouds are already being used to provide an HPC environment in which elastic resources can be instantiated on demand during the course of a scientific workflow. However, this model also raises many open and important challenges, such as scheduling workflow activities. Scheduling parallel scientific workflows in the cloud is a very complex task, since we have to take many different criteria into account and exploit elasticity to optimize workflow execution. In this paper, we introduce an adaptive scheduling heuristic for the parallel execution of scientific workflows in the cloud that is based on three criteria: total execution time (makespan), reliability, and financial cost. Besides scheduling workflow activities based on a three-objective cost model, this approach also scales resources up and down according to the restrictions imposed by scientists before workflow execution. This tuning is based on provenance data captured and queried at runtime. We conducted a thorough validation of our approach using a real bioinformatics workflow. The experiments were performed in SciCumulus, a cloud workflow engine for managing scientific workflow execution.
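The abstract names the three criteria of the cost model but does not give its formula. As a rough illustration only, the sketch below shows one common way to combine execution time, reliability, and financial cost into a single score: a weighted sum over min-max normalized estimates, followed by a greedy choice of the cheapest virtual machine for one activity. The `VM` fields, the `pick_vm` function, the weights, and the normalization are all assumptions made for this example and are not taken from the paper or from SciCumulus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    """Hypothetical per-VM estimates, e.g. derived from provenance data."""
    name: str
    est_time_s: float      # estimated execution time of the activity (seconds)
    failure_prob: float    # estimated probability that the activity fails here
    price_per_hour: float  # monetary price of the VM per hour


def _normalize(values):
    """Min-max normalize to [0, 1]; all-equal inputs map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pick_vm(vms, w_time, w_reliability, w_cost):
    """Return the VM with the lowest weighted three-criteria score.

    Assumed scoring: weighted sum of normalized execution time,
    failure probability, and monetary cost (lower is better).
    """
    if not vms:
        raise ValueError("at least one VM is required")
    times = _normalize([vm.est_time_s for vm in vms])
    fails = _normalize([vm.failure_prob for vm in vms])
    money = _normalize([vm.est_time_s / 3600.0 * vm.price_per_hour for vm in vms])
    scores = [
        w_time * t + w_reliability * f + w_cost * m
        for t, f, m in zip(times, fails, money)
    ]
    best = min(range(len(vms)), key=scores.__getitem__)
    return vms[best]


if __name__ == "__main__":
    candidates = [
        VM("small", est_time_s=400.0, failure_prob=0.05, price_per_hour=0.10),
        VM("large", est_time_s=100.0, failure_prob=0.02, price_per_hour=0.80),
    ]
    # A scientist who cares mostly about makespan vs. mostly about money.
    print(pick_vm(candidates, 0.8, 0.1, 0.1).name)
    print(pick_vm(candidates, 0.1, 0.1, 0.8).name)
```

Changing the weights shifts the choice between faster and cheaper machines, which is the kind of trade-off a multi-objective cost model is meant to expose. An adaptive scheduler would additionally refresh the estimates at runtime and add or remove machines, which this sketch does not attempt.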
