Conference: International Symposium on Networks, Computers and Communications

On a Dynamic Data Placement Strategy for Heterogeneous Hadoop Clusters

Abstract

Hadoop is one of the most popular distributed systems for big data computing in both industry and the scientific community. The default data placement strategy of the Hadoop Distributed File System (HDFS), which was initially designed for homogeneous environments, may suffer from performance degradation when deployed in heterogeneous clusters composed of data nodes with disparate computing power and disk capacity, thereby undermining the performance of MapReduce applications. In this paper, we use a Grey Forecast model to predict data hotness dynamically and determine an appropriate number of data block replicas on the fly. Based on this information, we further propose a dynamic data placement strategy (DDPS) that decides the best location for new replicas according to their hotness. The proposed method dynamically adjusts the data replicas stored on each node in a heterogeneous Hadoop cluster and reduces the response time of big data applications. Experimental results on a heterogeneous Hadoop cluster show that DDPS, together with the prediction model, significantly increases application execution efficiency and improves MapReduce performance over the default HDFS configuration.
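The abstract does not say which grey model the authors use or how hotness maps to a replica count. As a rough illustration only, the sketch below assumes the standard GM(1,1) grey model applied to a short history of per-block access counts, followed by a hypothetical rule that turns the forecast into a replica count. The function names, the `accesses_per_replica` parameter, and the upper bound of 10 are assumptions made for this example, not details from the paper. The lower bound of 3 matches the default HDFS replication factor.

```python
import math
from typing import Sequence


def gm11_forecast(history: Sequence[float]) -> float:
    """One-step-ahead forecast with the standard GM(1,1) grey model.

    `history` is assumed to hold positive access counts for one data
    block over consecutive, equally spaced time windows.
    """
    n = len(history)
    if n < 4 or any(v <= 0 for v in history):
        raise ValueError("GM(1,1) needs at least 4 positive observations")

    # Accumulated generating operation: the running sum of the raw series.
    x1 = []
    total = 0.0
    for v in history:
        total += v
        x1.append(total)

    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(history[1:])
    m = n - 1

    # Least-squares fit of y[k] = -a * z[k] + b via the 2x2 normal equations.
    sz = sum(z)
    szz = sum(v * v for v in z)
    sy = sum(y)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det

    if abs(a) < 1e-12:
        # No growth or decay detected: the fitted level is the forecast.
        return b

    def x1_hat(k: int) -> float:
        # Time-response function of the accumulated series (k is 0-based).
        return (history[0] - b / a) * math.exp(-a * k) + b / a

    # Undo the accumulation to get the next raw value.
    return x1_hat(n) - x1_hat(n - 1)


def replica_count(predicted_accesses: float,
                  accesses_per_replica: float,
                  min_replicas: int = 3,
                  max_replicas: int = 10) -> int:
    """Hypothetical mapping from forecast hotness to a replica count.

    Assumes each replica can comfortably serve `accesses_per_replica`
    accesses per window; the result is clamped to [min_replicas, max_replicas].
    """
    wanted = math.ceil(predicted_accesses / accesses_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    accesses = [100, 120, 150, 190, 240]  # made-up access counts per window
    hotness = gm11_forecast(accesses)
    print(hotness, replica_count(hotness, accesses_per_replica=50))
```

GM(1,1) is a common choice for this kind of forecast because it can be fitted from only a handful of observations, which suits blocks with a short access history. Choosing which heterogeneous node receives each new replica is the role of DDPS and is not modeled here.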
