First International Conference on Networks & Soft Computing

An experimental approach towards big data for analyzing memory utilization on a Hadoop cluster using HDFS and MapReduce


Abstract

When the amount of data is too large to be handled by a conventional database management system, it is called big data. Big data creates new challenges for data analysts. Data can take three forms: structured, unstructured, and semi-structured. Most big data is unstructured, and unstructured data is difficult to handle. The Apache Hadoop project provides tools and techniques for handling this huge amount of data: the Hadoop Distributed File System (HDFS) can be used for storage and MapReduce for processing. In this paper, we present our experimental work on big data using HDFS and MapReduce. We analyze variables such as the amount of time spent by the map and reduce tasks and the memory used by the mappers and the reducers, for both storage and processing of the data on a Hadoop cluster.
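As a concrete illustration of the kind of measurement the abstract describes, the sketch below submits a simple word-count style MapReduce job and then reads back the built-in Hadoop job counters that track time spent by map and reduce tasks and task memory usage. This is not the authors' code: the class names (CounterProbe, TokenMapper, SumReducer) and the command-line input/output paths are illustrative placeholders, while JobCounter.MILLIS_MAPS, JobCounter.MILLIS_REDUCES, TaskCounter.PHYSICAL_MEMORY_BYTES, and TaskCounter.COMMITTED_HEAP_BYTES are standard counters in Hadoop 2.x MapReduce.

// Minimal sketch (assumed setup, not the paper's implementation): run a simple
// MapReduce job on HDFS input and print the built-in counters relevant to the
// paper's variables -- map/reduce time and task memory usage.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobCounter;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.TaskCounter;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CounterProbe {

  // Word-count mapper: emits (word, 1) for every whitespace-separated token.
  public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws java.io.IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);
        }
      }
    }
  }

  // Reducer: sums the counts for each word.
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws java.io.IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "counter-probe");
    job.setJarByClass(CounterProbe.class);
    job.setMapperClass(TokenMapper.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory (placeholder)
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory (placeholder)

    boolean ok = job.waitForCompletion(true);

    // After completion the framework exposes job-level counters, aggregated
    // over all tasks, including the time and memory counters below.
    Counters counters = job.getCounters();
    System.out.println("Total time spent by all maps (ms):    "
        + counters.findCounter(JobCounter.MILLIS_MAPS).getValue());
    System.out.println("Total time spent by all reduces (ms): "
        + counters.findCounter(JobCounter.MILLIS_REDUCES).getValue());
    System.out.println("Physical memory (bytes), all tasks:   "
        + counters.findCounter(TaskCounter.PHYSICAL_MEMORY_BYTES).getValue());
    System.out.println("Committed heap (bytes), all tasks:    "
        + counters.findCounter(TaskCounter.COMMITTED_HEAP_BYTES).getValue());

    System.exit(ok ? 0 : 1);
  }
}

The same counters are also reported in the job history UI and in the client console output after each job, which is how such per-job time and memory figures are typically collected on a cluster.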
