Conference: 2012 IEEE Eighth World Congress on Services

Fault Tolerant Clustering in Scientific Workflows



Abstract

Task clustering has been shown to be an effective method for reducing execution overhead and increasing the computational granularity of workflow tasks executing on distributed resources. However, a job composed of multiple tasks may be at greater risk of failure than a job composed of a single task. Our theoretical analysis and simulation results demonstrate that failures can significantly degrade the runtime performance of workflows that use existing clustering policies, which ignore failures. We therefore propose two general failure modeling frameworks (a task failure model and a job failure model) to address these performance issues. We show that fault tolerance must be considered in the task failure model. Based on the task failure model, we propose three methods to improve workflow performance in dynamic environments. A simulation-based evaluation shows that our approach can significantly improve the workflow makespan for two important applications.
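
The intuition that a clustered job is more failure-prone than a single task can be illustrated with a minimal sketch. It assumes, purely for illustration, that tasks fail independently with the same probability and that a job fails if any one of its tasks fails. These assumptions and the function names are hypothetical; they are not the task failure model or job failure model proposed in the paper.

```python
import math


def job_failure_probability(task_failure_prob: float, tasks_per_job: int) -> float:
    """Probability that a clustered job fails.

    Illustrative assumption (not from the paper): tasks fail independently
    with the same probability, and the job fails if any task fails.
    """
    if not 0.0 <= task_failure_prob <= 1.0:
        raise ValueError("task_failure_prob must be in [0, 1]")
    if tasks_per_job < 1:
        raise ValueError("tasks_per_job must be at least 1")
    # The job succeeds only if every one of its tasks succeeds.
    return 1.0 - (1.0 - task_failure_prob) ** tasks_per_job


def expected_job_attempts(task_failure_prob: float, tasks_per_job: int) -> float:
    """Expected number of attempts until the whole job succeeds.

    Illustrative assumption: a failed job is retried as a whole, and the
    attempts are independent, so the count follows a geometric distribution.
    """
    q = job_failure_probability(task_failure_prob, tasks_per_job)
    if q >= 1.0:
        return math.inf  # the job can never succeed
    return 1.0 / (1.0 - q)


if __name__ == "__main__":
    # Compare several clustering sizes at a fixed per-task failure probability.
    for k in (1, 5, 10, 20):
        q = job_failure_probability(0.05, k)
        attempts = expected_job_attempts(0.05, k)
        print(f"tasks per job = {k:2d}  P(job fails) = {q:.3f}  expected attempts = {attempts:.2f}")
```

Under these assumptions the job failure probability grows with the number of tasks per job, which is the trade-off against the overhead savings of clustering that the abstract describes.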
