Conference: Asia-Pacific Network Operations and Management Symposium

A load balance algorithm based on nodes performance in Hadoop cluster



Abstract

MapReduce is an important distributed programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-source implementation of MapReduce, and it is often applied to short jobs for which low response time is critical. Hadoop performs well when the cluster nodes are homogeneous. In practice, however, the homogeneity assumption does not always hold. In a heterogeneous environment, devices vary greatly in computation capacity, communication capacity, architecture, memory, and power. When such nodes each process the same amount of data, a load balancing problem arises, because slower nodes take longer to finish. In this paper we address the problem of how to assign data after the Map phase so as to balance the execution time of the Reduce tasks. We propose a novel load balancing algorithm based on nodes performance (LBNP), in which the input data assigned to poorly performing nodes is reduced. Simulation results indicate that all Reduce tasks can then be completed at the same time, which shortens the whole Reduce phase and thus improves the efficiency of MapReduce.
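The abstract does not give the LBNP assignment rule itself, so the following is only a minimal sketch of the general idea it describes: give each Reduce node an amount of data proportional to its performance, so that estimated completion times line up. The function names, the per-node `speed` score (records processed per second), and the sample cluster are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Dict


def proportional_shares(total_records: int, node_speed: Dict[str, float]) -> Dict[str, int]:
    """Split total_records across nodes in proportion to a speed score.

    node_speed is a hypothetical performance metric (records per second);
    the paper's actual performance measure is not stated in the abstract.
    """
    if total_records < 0:
        raise ValueError("total_records must be non-negative")
    if not node_speed or any(s <= 0 for s in node_speed.values()):
        raise ValueError("every node needs a positive speed score")

    total_speed = sum(node_speed.values())
    # Floor of each node's proportional share.
    shares = {n: int(total_records * s / total_speed) for n, s in node_speed.items()}
    # Hand records lost to rounding to the fastest nodes, one each.
    leftover = total_records - sum(shares.values())
    for node in sorted(node_speed, key=node_speed.get, reverse=True)[:leftover]:
        shares[node] += 1
    return shares


def estimated_times(shares: Dict[str, int], node_speed: Dict[str, float]) -> Dict[str, float]:
    """Estimated Reduce time per node, assuming time = records / speed."""
    return {n: shares[n] / node_speed[n] for n in shares}


if __name__ == "__main__":
    speeds = {"fast": 4.0, "medium": 2.0, "slow": 1.0}  # assumed scores
    shares = proportional_shares(700, speeds)
    print(shares)
    print(estimated_times(shares, speeds))
```

With an even split, the slowest node in this made-up cluster would dominate the Reduce phase; with the proportional split, the slow node receives the least data and the estimated times coincide, which matches the behaviour the abstract reports for its simulations.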
