Conference: International Conference on Distributed Computing and Internet Technologies

Improving MapReduce Performance through Complexity and Performance Based Data Placement in Heterogeneous Hadoop Clusters


Abstract

MapReduce has emerged as an important programming model for clusters with tens of thousands of nodes. A cluster running Hadoop, an open-source implementation of MapReduce, may contain nodes that are heterogeneous in computing capacity for various reasons. It is therefore important for data placement algorithms to partition the input and intermediate data according to the computing capacities of the nodes in the cluster. We propose several enhancements to the data placement algorithms in Hadoop so that the load is distributed evenly across the nodes. First, we propose two techniques to measure the computing capacities of the nodes. Second, we propose improvements to the input data distribution algorithm based on the complexities of the map and reduce functions and the measured heterogeneity of the nodes. Finally, we evaluate the resulting improvement in MapReduce performance.
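
To make the idea of capacity-based placement concrete, the following is a minimal sketch of splitting input blocks across nodes in proportion to a measured capacity score. It is not the authors' algorithm: the abstract does not say how capacities are measured or how map and reduce complexities enter the calculation. The function name `allocate_blocks`, the node names, and the capacity scores are all hypothetical, and the largest-remainder rounding is simply one reasonable way to turn fractional shares into whole blocks.

```python
from typing import Dict


def allocate_blocks(total_blocks: int, capacities: Dict[str, float]) -> Dict[str, int]:
    """Split total_blocks across nodes in proportion to capacity (illustrative only).

    `capacities` maps a hypothetical node name to a positive relative
    capacity score; how such a score is measured is outside this sketch.
    """
    if total_blocks < 0:
        raise ValueError("total_blocks must be non-negative")
    if not capacities or any(c <= 0 for c in capacities.values()):
        raise ValueError("capacities must be non-empty and positive")

    total_capacity = sum(capacities.values())

    # Ideal fractional share of the input for each node.
    ideal = {node: total_blocks * c / total_capacity for node, c in capacities.items()}

    # Give every node the whole-number part of its share first.
    allocation = {node: int(share) for node, share in ideal.items()}
    leftover = total_blocks - sum(allocation.values())

    # Largest-remainder rounding: the remaining blocks go to the nodes
    # whose fractional parts are largest.
    by_remainder = sorted(
        capacities, key=lambda node: ideal[node] - allocation[node], reverse=True
    )
    for node in by_remainder[:leftover]:
        allocation[node] += 1

    return allocation


if __name__ == "__main__":
    # Hypothetical capacity scores: "fast" is three times as capable as "slow".
    print(allocate_blocks(12, {"fast": 3.0, "medium": 2.0, "slow": 1.0}))
```

With these assumed scores, a node rated three times faster receives three times as many blocks, so under this simple model all nodes would finish their share at roughly the same time.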