Journal: Future Generation Computer Systems

Cloud infrastructure provenance collection and management to reproduce scientific workflows execution

Abstract

The emergence of Cloud computing provides a new computing paradigm for scientific workflow execution. It provides dynamic, on-demand and scalable resources that enable the processing of complex workflow-based experiments. With the ever-growing size of experimental data and increasingly complex processing workflows, reproducibility has become essential. Provenance has been regarded as a mechanism to verify a workflow and to provide workflow reproducibility. One of the obstacles in reproducing an experiment execution is the lack of information about the execution infrastructure in the collected provenance. This information becomes critical in the context of the Cloud, in which resources are provisioned on demand by specifying resource configurations. Therefore, a mechanism is required that captures infrastructure information along with the provenance of workflows executing on the Cloud, to facilitate the re-creation of the execution environment on the Cloud. This paper presents a framework to Reproduce Scientific Workflow Execution using Cloud-Aware Provenance (ReCAP), along with proposed mapping approaches that aid in capturing the Cloud-aware provenance information and help in re-provisioning the execution resources on the Cloud with similar configurations. Experimental evaluation has shown the impact of different resource configurations on workflow execution performance, which justifies the need for collecting such provenance information in the context of the Cloud. The evaluation has also demonstrated that the proposed mapping approaches can capture Cloud information in various Cloud usage scenarios without causing performance overhead and can also enable the re-provisioning of resources on the Cloud. Experiments were conducted using workflows from different scientific domains, such as astronomy and neuroscience, to demonstrate the applicability of this research to different workflows.
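
The abstract does not describe how the mapping approaches work internally, so the following is only an illustrative sketch of the general idea of linking workflow provenance to infrastructure information. It assumes, purely for illustration, that each job record carries the IP address of the host it ran on and that the Cloud layer can list its virtual machines with their configurations. All names here (`JobRecord`, `VmRecord`, `map_jobs_to_vms`, `reprovision_request`) and all fields are hypothetical and are not taken from ReCAP.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class JobRecord:
    """Workflow-level provenance for one job (hypothetical fields)."""
    job_id: str
    host_ip: str
    runtime_s: float


@dataclass(frozen=True)
class VmRecord:
    """Cloud-level description of one virtual machine (hypothetical fields)."""
    ip: str
    flavor: str
    vcpus: int
    ram_mb: int
    disk_gb: int
    image: str


def map_jobs_to_vms(jobs, vms):
    """Attach VM configuration to each job by joining on IP address.

    Assumption: the IP address is a usable join key. Jobs whose host
    cannot be found keep a `None` VM entry instead of being dropped.
    """
    by_ip = {vm.ip: vm for vm in vms}
    merged = []
    for job in jobs:
        vm = by_ip.get(job.host_ip)
        merged.append({
            "job_id": job.job_id,
            "runtime_s": job.runtime_s,
            "vm": asdict(vm) if vm is not None else None,
        })
    return merged


def reprovision_request(entry):
    """Build a resource request with the same configuration as the original VM.

    Returns None when no infrastructure information was captured.
    """
    vm = entry["vm"]
    if vm is None:
        return None
    keys = ("flavor", "vcpus", "ram_mb", "disk_gb", "image")
    return {key: vm[key] for key in keys}
```

A stored entry produced by `map_jobs_to_vms` can later be passed to `reprovision_request` to obtain a configuration to ask the Cloud for. Jobs without a matching VM yield `None`, which makes gaps in the captured provenance visible rather than hiding them.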
