Grid-centric scheduling strategies for workflow applications

Abstract

Grid computing faces a great challenge because its resources are not localized but distributed, heterogeneous, and dynamic. It is therefore essential to provide a set of programming tools that execute an application on Grid resources with as little input from the user as possible. The thesis of this work is that Grid-centric scheduling techniques for workflow applications can provide good usability of the Grid environment by reliably executing the application on a large-scale distributed system with good performance. We support our thesis with new and effective approaches in the following five aspects.

First, we modeled the performance of the existing scheduling approaches in a multi-cluster Grid environment. We implemented several widely used scheduling algorithms and identified the best candidate. The study further introduced a new measurement, based on our experiments, which can improve the schedule quality of some scheduling algorithms by as much as 20-fold in a multi-cluster Grid environment.

Second, we studied the scalability of the existing Grid scheduling algorithms. To deal with Grid systems consisting of hundreds of thousands of resources, we designed and implemented a novel approach that performs explicit resource selection decoupled from scheduling. Our experimental evaluation confirmed that our decoupled approach can scale in such an environment without sacrificing the quality of the schedule by more than 10%.

Third, we proposed solutions to address the dynamic nature of Grid computing with a new cluster-based hybrid scheduling mechanism. Our experimental results, collected from real executions on production clusters, demonstrated that this approach produces programs running 30% to 100% faster than the other scheduling approaches we implemented, on both reserved and shared resources.

Fourth, we improved the reliability of Grid computing by incorporating fault-tolerance and recovery mechanisms into the workflow application execution. Our experiments on a simulated multi-cluster Grid environment demonstrated the effectiveness of our approach and also characterized the three-way trade-off between reliability, performance, and resource usage when executing a workflow application.

Finally, we reduced the long batch-queue wait times often found in production Grid clusters. We developed a novel approach that partitions the workflow application and submits the partitions judiciously to achieve a lower total batch-queue wait time. The experimental results, derived from production-site batch-queue logs, show that our approach can reduce total wait time by as much as 70%.

Combined, our approaches can greatly improve the usability of Grid computing while increasing the performance of workflow applications in a multi-cluster Grid environment.

Bibliographic record

  • Author

    Zhang Yang;

  • Author affiliation
  • Year 2010
  • Total pages
  • Original format PDF
  • Language of text eng
  • CLC classification
