On a Dynamic Data Placement Strategy for Heterogeneous Hadoop Clusters

Abstract

Hadoop is one of the most popular distributed systems for big data computing in both industry and the scientific community. The default data placement strategy of the Hadoop Distributed File System (HDFS) was initially designed for homogeneous environments. When deployed in heterogeneous clusters composed of data nodes with disparate computing power and disk capacity, it may suffer from performance degradation, which in turn undermines the performance of MapReduce applications. In this paper, we use a Grey Forecast model to predict data hotness dynamically and determine an appropriate number of data block replicas on the fly. Based on this information, we further propose a dynamic data placement strategy (DDPS) that decides the best location for new replicas according to their hotness. The proposed method is able to dynamically adjust the data replicas stored on each node in a heterogeneous Hadoop cluster and reduce the response time of big data applications. Experimental results on a heterogeneous Hadoop cluster show that DDPS, together with the prediction model, significantly increases application execution efficiency and improves MapReduce performance over the default HDFS configuration.
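
The abstract does not spell out the forecasting step, so the following is only a minimal sketch of how a grey forecast of block hotness could look, assuming the standard GM(1,1) model and taking "hotness" to mean access counts per time window. The names `gm11_forecast` and `replicas_for`, the `per_replica` capacity, and the replica bounds are hypothetical illustrations and are not taken from the paper; only the lower bound of 3 mirrors the usual HDFS default replication factor.

```python
import math


def gm11_forecast(history, steps=1):
    """Forecast `steps` future values of a positive series with GM(1,1).

    Assumption: `history` holds access counts per time window for one block.
    """
    n = len(history)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least 4 observations")
    if any(v <= 0 for v in history):
        raise ValueError("GM(1,1) assumes strictly positive observations")

    # Accumulated generating operation (1-AGO): running sums of the series.
    x1 = []
    total = 0.0
    for v in history:
        total += v
        x1.append(total)

    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = history[1:]

    # Least-squares fit of x0(k) = -a * z(k) + b via the 2x2 normal equations.
    m = n - 1
    sz = sum(z)
    szz = sum(v * v for v in z)
    sy = sum(y)
    szy = sum(zk * yk for zk, yk in zip(z, y))
    det = m * szz - sz * sz
    if det == 0:
        raise ValueError("degenerate series: cannot fit the grey model")
    slope = (m * szy - sz * sy) / det
    a = -slope                    # development coefficient
    b = (sy - slope * sz) / m     # grey input (intercept of the fit)

    # A flat series gives a == 0; the forecast is then simply the constant b.
    if abs(a) < 1e-12:
        return [b] * steps

    def x1_hat(k):
        # Time response of the whitened equation dx1/dt + a * x1 = b.
        return (history[0] - b / a) * math.exp(-a * k) + b / a

    # Undo the accumulation to get forecasts of the original series.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


def replicas_for(hotness, per_replica=10.0, min_replicas=3, max_replicas=10):
    """Hypothetical rule: one replica per `per_replica` predicted accesses."""
    wanted = math.ceil(hotness / per_replica)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    accesses = [10, 12, 14.4, 17.28]   # made-up access counts, growing 20% per window
    predicted = gm11_forecast(accesses, steps=1)[0]
    print(predicted, replicas_for(predicted))
```

A placement strategy along the lines the abstract describes would then still have to decide where each additional replica goes; that step depends on per-node computing power and disk capacity and is not sketched here.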

Bibliographic record

Source: International Symposium on Networks, Computers and Communications | 2018 | 616p | 7 pages
Authors: Yang Liu; Chase Q. Wu; Meng Wang; Aiqin Hou; Yongqiang Wang
Original format: PDF
Chinese Library Classification: TP3-53
Keywords: Predictive models; Data models; Mathematical model; Big Data applications; Real-time systems; Differential equations
