Conference: International Convention on Information and Communication Technology, Electronics and Microelectronics

Achieving dynamic workflow management system by applying provenance based checkpointing method


Abstract

Scientific workflows are data- and compute-intensive and may therefore run for days or even weeks on parallel and distributed infrastructures such as HPC systems and clouds. In an HPC environment, the number of failures that arise during scientific workflow enactment can be high, so the use of fault tolerance techniques is unavoidable. The most frequently used fault tolerance techniques are job replication and checkpointing. Job replication relies on the assumption that a single failure is much more probable than simultaneous failures, whereas checkpointing saves certain states so that execution can later be restarted from that point. The effectiveness of checkpointing depends on the checkpointing interval, and a common technique is to adapt this interval dynamically. In this work we give a brief overview of the different checkpointing techniques and propose a new provenance-based dynamic checkpointing method.
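To make the dependence on the checkpointing interval concrete, the sketch below shows one generic way an interval could be adapted at run time. It is not the provenance-based method proposed in the paper, which the abstract does not describe in detail. It assumes Young's first-order approximation for the interval, sqrt(2 * C * MTBF), where C is the cost of writing one checkpoint and MTBF is the mean time between failures. The names `young_interval` and `AdaptiveCheckpointer`, and the choice to estimate MTBF as the mean of observed uptimes, are hypothetical and introduced here only for illustration.

```python
import math


def young_interval(checkpoint_cost, mtbf):
    """Young's first-order approximation of the checkpoint interval.

    Both arguments are in the same time unit (e.g. seconds).
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


class AdaptiveCheckpointer:
    """Hypothetical helper: re-estimates MTBF from observed failures.

    Assumption: MTBF is approximated by the mean uptime between the
    failures recorded so far; a prior value is used until the first
    failure is observed.
    """

    def __init__(self, checkpoint_cost, initial_mtbf):
        self.checkpoint_cost = checkpoint_cost
        self.initial_mtbf = initial_mtbf
        self.uptimes = []  # observed uptimes before each failure

    def record_failure(self, uptime):
        """Store how long the job ran before the latest failure."""
        if uptime <= 0:
            raise ValueError("uptime must be positive")
        self.uptimes.append(uptime)

    def estimated_mtbf(self):
        """Return the mean observed uptime, or the prior if none yet."""
        if not self.uptimes:
            return self.initial_mtbf
        return sum(self.uptimes) / len(self.uptimes)

    def interval(self):
        """Current checkpoint interval under the running MTBF estimate."""
        return young_interval(self.checkpoint_cost, self.estimated_mtbf())
```

With a checkpoint cost of 50 s and an assumed MTBF of 10,000 s, `AdaptiveCheckpointer(50.0, 10000.0).interval()` yields 1,000 s. If the job then fails twice after 2,500 s of uptime each, the estimated MTBF drops to 2,500 s and the interval shortens to 500 s, which is the qualitative behaviour dynamic adaptation aims for: more failures lead to more frequent checkpoints.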
