
Improving Scheduling Efficiency of Hadoop YARN Using AFSA Algorithm


Abstract

Apache Hadoop is one of the most popular MapReduce frameworks for parallel processing of large data sets. YARN, as its job scheduler and resource manager, plays a central role. Schedulers on YARN are designed to minimize the makespan of MapReduce jobs. The performance of a YARN scheduler depends not only on whether the resource capacity of the worker nodes is fully utilized, but also on the dependencies among tasks, which makes an optimal solution very difficult to achieve. This paper proposes a new Hadoop YARN scheduling algorithm. The algorithm formalizes scheduling as a multiple knapsack problem that takes into account the resource cost and time cost of each task as well as the dependencies between tasks. The Artificial Fish Swarm Algorithm (AFSA) is adopted to solve the knapsack optimization problem. The algorithm was implemented as a pluggable scheduler on the most recent version of Hadoop YARN and evaluated with several MapReduce benchmarks. The experimental results show that the scheduler reduces the makespan of Hadoop jobs by 30% compared with some existing scheduling policies.
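For orientation, the standard textbook form of the multiple knapsack problem is sketched below. The abstract does not give the paper's exact model, so the mapping is an assumption made here for illustration: the m knapsacks stand for worker nodes with resource capacities c_i, the n items stand for tasks with resource costs w_j and values p_j, and x_{ij} = 1 means task j is placed on node i. The time cost and task dependencies mentioned in the abstract would enter as further terms or constraints that are not shown.

```latex
% Standard multiple knapsack problem (illustrative; not the paper's exact model).
% Assumed symbols: m nodes, n tasks, capacity c_i, resource cost w_j, value p_j.
\begin{align}
\max_{x} \quad & \sum_{i=1}^{m} \sum_{j=1}^{n} p_j \, x_{ij} \\
\text{s.t.} \quad & \sum_{j=1}^{n} w_j \, x_{ij} \le c_i, && i = 1, \dots, m \\
& \sum_{i=1}^{m} x_{ij} \le 1, && j = 1, \dots, n \\
& x_{ij} \in \{0, 1\}, && i = 1, \dots, m, \; j = 1, \dots, n
\end{align}
```

The first constraint keeps each node within its capacity, and the second assigns each task to at most one node. The problem is NP-hard, which is consistent with the abstract's choice of a metaheuristic such as AFSA over an exact solver.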
