First International Conference on Networks & Soft Computing

An experimental approach towards big data for analyzing memory utilization on a Hadoop cluster using HDFS and MapReduce


Abstract

When the amount of data is too large to be handled by a conventional database management system, it is called big data. Big data creates new challenges for data analysts. Data can take three forms: structured, unstructured, and semi-structured. Most big data is unstructured, and unstructured data is difficult to handle. The Apache Hadoop project provides tools and techniques for handling this huge amount of data: the Hadoop Distributed File System (HDFS) can be used for storage and MapReduce for processing. In this paper, we present our experimental work on big data using HDFS and MapReduce. We analyze variables such as the amount of time spent by the map and reduce tasks and the memory used by the mappers and the reducers, for both storage and processing of the data on a Hadoop cluster.
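As a concrete illustration of the kind of measurement the abstract describes, the sketch below submits a simple word-count style MapReduce job and then reads back the built-in Hadoop job counters that track time spent by map and reduce tasks and task memory usage. This is not the authors' code: the class names (CounterProbe, TokenMapper, SumReducer) and the command-line input/output paths are illustrative placeholders, while JobCounter.MILLIS_MAPS, JobCounter.MILLIS_REDUCES, TaskCounter.PHYSICAL_MEMORY_BYTES, and TaskCounter.COMMITTED_HEAP_BYTES are standard counters in Hadoop 2.x MapReduce.

// Minimal sketch (assumed setup, not the paper's implementation): run a simple
// MapReduce job on HDFS input and print the built-in counters relevant to the
// paper's variables -- map/reduce time and task memory usage.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobCounter;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.TaskCounter;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CounterProbe {

  // Word-count mapper: emits (word, 1) for every whitespace-separated token.
  public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws java.io.IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);
        }
      }
    }
  }

  // Reducer: sums the counts for each word.
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws java.io.IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "counter-probe");
    job.setJarByClass(CounterProbe.class);
    job.setMapperClass(TokenMapper.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory (placeholder)
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory (placeholder)

    boolean ok = job.waitForCompletion(true);

    // After completion the framework exposes job-level counters, aggregated
    // over all tasks, including the time and memory counters below.
    Counters counters = job.getCounters();
    System.out.println("Total time spent by all maps (ms):    "
        + counters.findCounter(JobCounter.MILLIS_MAPS).getValue());
    System.out.println("Total time spent by all reduces (ms): "
        + counters.findCounter(JobCounter.MILLIS_REDUCES).getValue());
    System.out.println("Physical memory (bytes), all tasks:   "
        + counters.findCounter(TaskCounter.PHYSICAL_MEMORY_BYTES).getValue());
    System.out.println("Committed heap (bytes), all tasks:    "
        + counters.findCounter(TaskCounter.COMMITTED_HEAP_BYTES).getValue());

    System.exit(ok ? 0 : 1);
  }
}

The same counters are also reported in the job history UI and in the client console output after each job, which is how such per-job time and memory figures are typically collected on a cluster.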
