Journal of Grid Computing

A Task-Based Greedy Scheduling Algorithm for Minimizing Energy of MapReduce Jobs


Abstract

MapReduce and its open-source implementation, Hadoop, have gained widespread adoption for parallel processing of big data jobs. Since the number of such jobs is rising rapidly, reducing their energy consumption is increasingly important for lowering both environmental impact and operational costs. Prior work by Mashayekhy et al. (IEEE Trans. Parallel Distributed Syst. 26, 2720–2733, 2016) has tackled the problem of energy-aware scheduling of a single MapReduce job, but we provide a far more efficient heuristic in this paper. We first model the problem as an Integer Linear Program (ILP) to find the optimal solution using ILP solvers. Then we present a task-based greedy scheduling algorithm, TGSAVE, which selects a slot for each task so as to minimize the total energy consumption of a MapReduce job for big data applications in heterogeneous environments, without significant performance loss and while satisfying the service level agreement (SLA). We perform several experiments on a Hadoop cluster to measure task characteristics for nine different applications and use them to evaluate our proposed algorithm. The results show that the total energy consumption of MapReduce jobs obtained by TGSAVE is up to 35% less than that achieved by EMRSA, its closest rival, proposed in Mashayekhy et al. (IEEE Trans. Parallel Distributed Syst. 26, 2720–2733, 2016), for the same workloads. In addition, TGSAVE is capable of finding a solution in the same order of time for deadlines up to 74% tighter than the tightest deadline for which EMRSA can find a feasible solution. On average, the TGSAVE solution is approximately 1.4% away from the optimal solution, and it can meet deadlines as tight as 12%, on average, above the energy-oblivious minimum makespan in the benchmarks we examined.
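
The abstract does not give the details of TGSAVE, so the following is only a minimal sketch of the general idea of task-based greedy slot selection under a deadline, not the authors' algorithm. It assumes a single phase (no separate map and reduce slots), and the names `greedy_schedule`, `energy`, `runtime`, and `deadline` are hypothetical. Each task is placed on the lowest-energy slot that can still finish before the deadline.

```python
from typing import List, Optional, Tuple


def greedy_schedule(
    energy: List[List[float]],
    runtime: List[List[float]],
    deadline: float,
) -> Optional[Tuple[List[int], float]]:
    """Greedy per-task slot selection (illustrative, not the paper's TGSAVE).

    energy[t][s] and runtime[t][s] are assumed inputs: the energy and the
    execution time of task t on slot s. Tasks assigned to the same slot run
    one after another, so a slot's load is the sum of its tasks' runtimes.
    Returns (assignment, total_energy), or None if the greedy pass cannot
    place some task within the deadline.
    """
    n_tasks = len(energy)
    n_slots = len(energy[0]) if n_tasks else 0
    load = [0.0] * n_slots          # accumulated runtime per slot
    assignment = [-1] * n_tasks     # chosen slot index per task
    total_energy = 0.0

    # Place the longest tasks first (by their best-case runtime), since
    # they are the hardest to fit under a tight deadline.
    order = sorted(range(n_tasks), key=lambda t: -min(runtime[t]))

    for t in order:
        # Slots on which task t would still finish by the deadline.
        feasible = [s for s in range(n_slots)
                    if load[s] + runtime[t][s] <= deadline]
        if not feasible:
            return None  # greedy choice got stuck; no feasible slot left
        # Pick the cheapest feasible slot; break ties by shorter runtime.
        best = min(feasible, key=lambda s: (energy[t][s], runtime[t][s]))
        assignment[t] = best
        load[best] += runtime[t][best]
        total_energy += energy[t][best]

    return assignment, total_energy


if __name__ == "__main__":
    # Three tasks, two heterogeneous slots (made-up numbers).
    energy = [[4.0, 6.0], [3.0, 5.0], [2.0, 2.5]]
    runtime = [[5.0, 3.0], [4.0, 2.0], [3.0, 2.0]]
    print(greedy_schedule(energy, runtime, deadline=8.0))
    print(greedy_schedule(energy, runtime, deadline=4.0))
```

With the made-up numbers above, a deadline of 8.0 yields the assignment `[0, 1, 0]` with total energy 11.0, while a deadline of 4.0 makes this simple greedy pass return `None`. A heuristic of this kind can fail on instances where a feasible schedule exists, which is why the abstract compares TGSAVE against the ILP optimum.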
