Journal: 《计算机工程与设计》 (Computer Engineering and Design)

Research and Improvement of the MapReduce Model Based on Hadoop

Abstract

In the MapReduce model, the completion times of the Reduce tasks within a single job can differ greatly. This paper analyzes the factors that affect Reduce task completion time and identifies data skew across the Reduce task nodes as the cause. It proposes an improved model, MBR (Map-Balance-Reduce), which adds Balance tasks that run after the Map tasks and even out the intermediate data so that each Reduce task fetches a roughly equal share. This keeps the completion times of the Reduce tasks roughly the same. Simulation experiments show that, after the Balance tasks run, the intermediate data produced by the Map tasks is distributed fairly evenly across the Reduce task nodes, and that the execution time of the whole job is reduced to some extent.
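The abstract does not say how a Balance task redistributes intermediate data, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes the size of each intermediate key group is known once the Map tasks finish, and it swaps in a standard greedy heuristic (largest group first, always onto the least-loaded reducer) as a stand-in for the balancing step. The function names `hash_partition` and `balanced_partition` and the sample sizes are hypothetical.

```python
import heapq
from zlib import crc32


def hash_partition(key_sizes, num_reducers):
    """Baseline: send each key to hash(key) mod num_reducers, ignoring size.

    Returns the total intermediate data volume each reducer would receive.
    """
    loads = [0] * num_reducers
    for key, size in key_sizes.items():
        loads[crc32(key.encode("utf-8")) % num_reducers] += size
    return loads


def balanced_partition(key_sizes, num_reducers):
    """Assumed balancing step (not taken from the paper).

    Places key groups in descending size order, each on the reducer that
    currently holds the least data. Returns (assignment, loads).
    """
    heap = [(0, r) for r in range(num_reducers)]  # (current load, reducer id)
    heapq.heapify(heap)
    assignment = {}
    # Sort by size descending; break ties by key so the result is deterministic.
    for key, size in sorted(key_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (load + size, reducer))
    loads = [0] * num_reducers
    for key, reducer in assignment.items():
        loads[reducer] += key_sizes[key]
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical sizes of intermediate key groups produced by the Map tasks.
    sizes = {"a": 50, "b": 30, "c": 20, "d": 20, "e": 10, "f": 10}
    print("hash partition loads:    ", hash_partition(sizes, 2))
    print("balanced partition loads:", balanced_partition(sizes, 2)[1])
```

Because this sketch assigns whole key groups, a single very large key still lands on one reducer; splitting such a key would need extra handling that the abstract does not describe.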
