Journal: IEEE Transactions on Parallel and Distributed Systems

Energy-Aware Scheduling of MapReduce Jobs for Big Data Applications



Abstract

The majority of large-scale, data-intensive applications executed by data centers are based on MapReduce or its open-source implementation, Hadoop. Such applications run on large clusters that require large amounts of energy, making energy costs a considerable fraction of a data center's overall costs. Therefore, minimizing the energy consumed when executing each MapReduce job is a critical concern for data centers. In this paper, we propose a framework for improving the energy efficiency of MapReduce applications while satisfying the service level agreement (SLA). We first model the problem of energy-aware scheduling of a single MapReduce job as an integer program. We then propose two heuristic algorithms, called energy-aware MapReduce scheduling algorithms (EMRSA-I and EMRSA-II), that find assignments of map and reduce tasks to machine slots that minimize the energy consumed when executing the application. We perform extensive experiments on a Hadoop cluster to determine the energy consumption and execution time of several workloads from the HiBench benchmark suite, including TeraSort, PageRank, and K-means clustering, and then use these data in an extensive simulation study to analyze the performance of the proposed algorithms. The results show that EMRSA-I and EMRSA-II find near-optimal job schedules that consume approximately 40 percent less energy on average than the schedules obtained by a common-practice scheduler that minimizes the makespan.
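To make the scheduling problem concrete, the following is a minimal sketch of a deadline-constrained, energy-greedy assignment of tasks to slots. It is not EMRSA-I or EMRSA-II, whose details the abstract does not give. The `Slot` class, the `greedy_assign` and `total_energy` functions, the per-slot energy and time figures, and the single-deadline reading of the SLA are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A hypothetical machine slot with assumed per-task cost figures."""
    name: str
    energy_per_task: float  # assumed energy (J) to run one task on this slot
    time_per_task: float    # assumed processing time (s) of one task
    tasks: list = field(default_factory=list)

    @property
    def busy_time(self) -> float:
        # Tasks on one slot run back to back.
        return len(self.tasks) * self.time_per_task


def greedy_assign(num_tasks, slots, deadline):
    """Place each task on the lowest-energy slot that still meets the deadline.

    Illustrative heuristic only: it treats all tasks as identical and
    models the SLA as one completion deadline (both are assumptions).
    """
    by_energy = sorted(slots, key=lambda s: s.energy_per_task)
    for task_id in range(num_tasks):
        for slot in by_energy:
            if slot.busy_time + slot.time_per_task <= deadline:
                slot.tasks.append(task_id)
                break
        else:
            raise ValueError("no feasible assignment under this greedy rule")
    return slots


def total_energy(slots):
    """Sum the assumed energy of every assigned task."""
    return sum(len(s.tasks) * s.energy_per_task for s in slots)


if __name__ == "__main__":
    # Invented numbers: slot "a" is frugal but slow, slot "b" fast but costly.
    example = [Slot("a", 10.0, 4.0), Slot("b", 25.0, 2.0)]
    greedy_assign(5, example, deadline=8.0)
    for s in example:
        print(s.name, s.tasks)
    print(total_energy(example))  # 95.0
```

In this toy instance the frugal slot takes as many tasks as the deadline allows and the rest spill over to the costlier slot. A makespan-minimizing scheduler would instead favor the faster slot, which is the kind of trade-off the abstract's comparison refers to.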
