Published in: International Conference on Applied and Theoretical Computing and Communication Technology

Improving the efficiency of MapReduce scheduling algorithm in Hadoop

Abstract

Hadoop, a free Java-based programming framework, plays a vital role in supporting the processing of large data sets in a distributed computing environment. In Hadoop, the MapReduce technique is used to process and generate large data sets with a parallel, distributed algorithm on a cluster. The benefit of MapReduce is that it handles failures automatically and hides the complexity of fault tolerance from the user. By default, Hadoop uses the FIFO (First In, First Out) scheduling algorithm, in which jobs are executed in the order of their arrival. This method suits a homogeneous cloud well but performs poorly on a heterogeneous cloud. The LATE (Longest Approximate Time to End) algorithm was developed later; it reduces FIFO's response time by a factor of 2 and gives better performance in heterogeneous environments. The three principles of the LATE algorithm are (i) prioritizing tasks to speculate, (ii) selecting fast nodes to run on, and (iii) capping speculative tasks to prevent thrashing. Although LATE takes action on appropriate slow tasks, it cannot compute the remaining time of tasks correctly and therefore cannot find the tasks that are really slow. Finally, the SAMR (Self-Adaptive MapReduce) scheduling algorithm is introduced, which finds slow tasks dynamically by using the historical information recorded on each node to tune its parameters. SAMR reduces execution time by 25% compared to FIFO and by 14% compared to LATE.
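
The scheduling ideas summarized in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the names `Task`, `time_to_end`, `pick_speculative`, and `weighted_progress` are hypothetical, and the estimate used (remaining work divided by the observed progress rate) is the commonly described LATE heuristic, assumed here rather than taken from this abstract. The last function shows one assumed way in which stage weights learned from a node's history, in the spirit of SAMR, could replace fixed weights in the progress score.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    task_id: str
    progress: float  # fraction of work completed, in [0, 1]
    elapsed: float   # seconds since the task started


def time_to_end(task: Task) -> Optional[float]:
    """Estimate remaining seconds as (1 - progress) / progress_rate.

    Returns None when no progress has been observed yet, because the
    rate (and therefore the estimate) is undefined in that case.
    """
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return None
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def pick_speculative(tasks: List[Task], running_speculative: int,
                     cap: int, node_is_fast: bool) -> Optional[Task]:
    """Apply the three principles listed in the abstract.

    (ii)  only launch speculative copies on fast nodes,
    (iii) cap the number of speculative tasks to prevent thrashing,
    (i)   prioritize the task with the longest estimated time to end.
    """
    if not node_is_fast or running_speculative >= cap:
        return None
    estimated = [(time_to_end(t), t) for t in tasks if t.progress < 1.0]
    estimated = [(eta, t) for eta, t in estimated if eta is not None]
    if not estimated:
        return None
    return max(estimated, key=lambda pair: pair[0])[1]


def weighted_progress(stage_index: int, stage_fraction: float,
                      weights: List[float]) -> float:
    """Progress score from per-stage weights (assumed to sum to 1).

    Completed stages contribute their full weight; the current stage
    contributes its weight times the fraction finished. Fixed, equal
    weights model a static scheduler; weights read from a node's
    history model the self-adaptive idea.
    """
    done = sum(weights[:stage_index])
    return done + weights[stage_index] * stage_fraction


if __name__ == "__main__":
    tasks = [Task("a", 0.5, 10.0), Task("b", 0.2, 20.0), Task("c", 0.9, 9.0)]
    chosen = pick_speculative(tasks, running_speculative=0, cap=2,
                              node_is_fast=True)
    print(chosen.task_id if chosen else None)
```

With the sample tasks above, task `b` has the lowest progress rate and the most work left, so it is the one selected for speculation. The same task halfway through its second of three stages scores differently under equal weights and under history-derived weights, which is why mis-set weights can hide the tasks that are really slow.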