Journal: Engineering Applications of Artificial Intelligence

Reinforcement Learning based scheduling in a workflow management system



Abstract

Any computational process, from a simple data analytics task to the training of a machine learning model, can be described by a workflow. Many workflow management systems (WMS) exist that schedule workflows across distributed computational resources. In this work, we introduce a WMS that leverages machine learning to predict the runtime of workflow tasks and the probability that assigning a task to a given execution site will fail. The expected runtime of workflow tasks can be used to approximate the weight of each branch of the workflow graph with respect to the total workflow workload, and the ability to anticipate task failures can discourage task assignments that are unlikely to succeed. We demonstrate that the proposed machine learning models lead to significantly more informed scheduling decisions that minimize task failures and utilize execution sites more efficiently, thus reducing workflow runtime. Additionally, we train a modified sequence-to-sequence neural network architecture via reinforcement learning to make scheduling decisions as part of a WMS. The resulting WMS can drastically improve its scheduling performance by learning independently over time, without external intervention or reliance on any specific heuristic or optimization technique. Finally, we test our approach in real-world scenarios using computationally demanding and data-intensive workflows, and evaluate its performance against scheduling methodologies traditionally used in WMSes. The evaluation confirms that the proposed approach consistently and significantly outperforms the other scheduling algorithms, achieving the best execution runtime with the lowest number of failed tasks and the lowest communication costs.
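
The idea of combining predicted runtimes with predicted failure probabilities can be illustrated with a small sketch. This is not the method of the paper, which uses learned predictors and a sequence-to-sequence network trained by reinforcement learning. It is a simple greedy heuristic, and every name and number in it (the task names, the sites, the `RUNTIME` and `P_FAIL` tables, the retry-until-success cost model) is a hypothetical stand-in for what trained models would output.

```python
# Hypothetical workflow DAG: task -> list of successor tasks.
SUCCESSORS = {
    "extract": ["clean", "index"],
    "clean": ["train"],
    "index": ["train"],
    "train": [],
}
SITES = ["site_a", "site_b"]

# Hypothetical predictor outputs per (task, site): runtime in seconds
# and probability that the assignment fails.
RUNTIME = {
    ("extract", "site_a"): 10.0, ("extract", "site_b"): 8.0,
    ("clean", "site_a"): 20.0, ("clean", "site_b"): 30.0,
    ("index", "site_a"): 5.0, ("index", "site_b"): 5.0,
    ("train", "site_a"): 100.0, ("train", "site_b"): 60.0,
}
P_FAIL = {
    ("extract", "site_a"): 0.0, ("extract", "site_b"): 0.5,
    ("clean", "site_a"): 0.1, ("clean", "site_b"): 0.0,
    ("index", "site_a"): 0.2, ("index", "site_b"): 0.0,
    ("train", "site_a"): 0.0, ("train", "site_b"): 0.5,
}


def branch_weight(task, successors, mean_runtime):
    """Predicted runtime of the heaviest path from `task` to a sink, inclusive."""
    tail = max(
        (branch_weight(s, successors, mean_runtime) for s in successors[task]),
        default=0.0,
    )
    return mean_runtime[task] + tail


def expected_cost(runtime, p_fail):
    """Expected time if a failed attempt costs a full run and is retried
    until it succeeds (an assumed cost model): runtime / (1 - p_fail)."""
    if p_fail >= 1.0:
        return float("inf")
    return runtime / (1.0 - p_fail)


def schedule(successors, sites, runtime, p_fail):
    """Order tasks by decreasing branch weight, then give each task the
    site with the lowest expected cost."""
    mean_runtime = {
        t: sum(runtime[(t, s)] for s in sites) / len(sites) for t in successors
    }
    order = sorted(
        successors,
        key=lambda t: branch_weight(t, successors, mean_runtime),
        reverse=True,
    )
    return [
        (t, min(sites, key=lambda s: expected_cost(runtime[(t, s)], p_fail[(t, s)])))
        for t in order
    ]


plan = schedule(SUCCESSORS, SITES, RUNTIME, P_FAIL)
print(plan)
```

With positive runtimes, sorting by decreasing branch weight always places a task before its successors, so the resulting order respects the workflow's dependencies. Note how `train` is sent to the slower `site_a` in this toy data: the faster site's assumed 50% failure probability makes its expected cost higher, which is the kind of assignment the failure predictor is meant to discourage.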
