IEEE International Conference on Big Data

A Reinforcement Learning Based Resource Management Approach for Time-critical Workloads in Distributed Computing Environment



Abstract

Many data-analyzing applications rely heavily on timely responses from execution and are referred to as time-critical data-analyzing applications. Because gigantic volumes of data and heavy analytical computations arise frequently, running such applications on large-scale distributed computing environments is often advantageous. The workload of big data applications is often hybrid, i.e., it combines time-critical applications with regular, non-time-critical ones. Resource management for hybrid workloads in complex distributed computing environments is becoming increasingly critical and needs further study. However, it is difficult to design rule-based approaches well suited to such complex scenarios because many complicated characteristics must be taken into account. Therefore, we present an innovative reinforcement learning (RL) based resource management approach for hybrid workloads in distributed computing environments. We utilize neural networks to capture the desired resource management model, use reinforcement learning with a designed value definition to gradually improve the model, and use an ε-greedy methodology to extend exploration throughout the reinforcement process. Extensive experiments show that the resource management solution obtained through reinforcement learning greatly surpasses the baseline rule-based models. Specifically, the model both reduces deadline misses for time-critical applications and lowers the average job delay across all jobs in hybrid workloads. Our reinforcement learning based approach has been demonstrated to provide an efficient resource manager for the targeted scenarios.
