International Journal of Distributed and Parallel Systems

Scheduling Data Intensive Workloads through Virtualization on MapReduce based Clouds

Abstract

MapReduce has become a popular programming model for running data-intensive applications on the cloud. Completion-time goals, or deadlines, set by users for MapReduce jobs are becoming crucial in existing cloud-based data processing environments such as Hadoop. Scheduling MapReduce (MR) jobs to meet deadlines conflicts with “data locality” (assigning tasks to nodes that contain their input data): to meet a deadline, a task may be scheduled on a node that does not hold its input data, which causes an expensive data transfer from a remote node. This paper proposes a novel scheduler that addresses this problem, based primarily on dynamic resource reconfiguration. It has two components: 1) a Resource Predictor, which dynamically determines the number of Map/Reduce slots each job requires to meet its completion-time guarantee; and 2) a Resource Reconfigurator, which adjusts CPU resources by dynamically increasing or decreasing those of individual VMs, without violating users' completion-time goals, to maximize data locality and resource utilization among the active jobs. The proposed scheduler was evaluated against the Fair Scheduler on a virtual cluster built on a physical cluster of 20 machines. The results show an increase of about 12% in job throughput.
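
The abstract does not give the Resource Predictor's formula, so the following is only a minimal sketch of the general idea of sizing a job's slot allocation from its deadline. It assumes (hypothetically) that all tasks of a job take the same average time and run in synchronized waves; the function name `min_slots` and its parameters are illustrative and not taken from the paper.

```python
import math


def min_slots(num_tasks: int, avg_task_seconds: float, seconds_to_deadline: float) -> int:
    """Return the smallest slot count that finishes all tasks before the deadline.

    Assumption (not from the paper): tasks have uniform duration and execute
    in waves, so a job with s slots needs ceil(num_tasks / s) waves.
    """
    if num_tasks <= 0:
        return 0
    if avg_task_seconds <= 0:
        raise ValueError("average task time must be positive")
    # Number of full task waves that fit into the remaining time.
    waves = math.floor(seconds_to_deadline / avg_task_seconds)
    if waves < 1:
        raise ValueError("deadline is shorter than a single task")
    # Spread the tasks over the available waves, rounding up.
    return math.ceil(num_tasks / waves)


if __name__ == "__main__":
    # 100 map tasks of about 30 s each, 300 s left before the deadline.
    print(min_slots(100, 30.0, 300.0))
```

A real predictor would presumably re-run such an estimate as tasks complete and as measured task times change, which is what "dynamically determines" suggests; that refinement is omitted here.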