A Time Based Analysis of Data Processing on Hadoop Cluster

Abstract

When data grows so large that a traditional database management system can no longer manage it, it is called Big Data. Managing data at this volume is difficult. Hadoop is a technological answer to Big Data. Data storage and the retrieval of information from the data are handled by the Hadoop Distributed File System (HDFS) and the MapReduce programming model. MapReduce provides effective benchmarks for retrieving information from Big Data. In this paper we present our experimental work on a Hadoop cluster. We analyzed the time the cluster requires to process data as the number of nodes in the cluster increases, starting with a single node and adding one node at a time. We analyzed three types of time: real time, user time, and system time.
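The three times named in the abstract match what the Unix `time` utility reports, although the abstract does not say which tool the authors used. As a minimal sketch under that assumption, the following Python snippet measures real (wall-clock), user (CPU time in user mode), and system (CPU time in kernel mode) time for a child process. The helper name `time_command` and the stand-in workload are hypothetical and not taken from the paper; in an experiment like the one described, the command being timed would be the job submission to the cluster.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a child process and return (real, user, system) seconds.

    Hypothetical helper, not from the paper. It mirrors the three figures
    printed by the Unix `time` utility. The child CPU counters from
    os.times() are only meaningful on Unix-like systems.
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # blocks until the child exits
    real = time.perf_counter() - start
    after = os.times()
    # CPU time consumed by waited-for children, split by mode.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Stand-in workload (an assumption for illustration only).
    workload = [sys.executable, "-c", "sum(range(10**6))"]
    real, user, system = time_command(workload)
    print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

Real time includes waiting on I/O and scheduling, so it can exceed the sum of user and system time. For a parallel job it can also be smaller than that sum, because CPU time is accumulated across all cores.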
