International Conference on Applied and Theoretical Computing and Communication Technology

A Big Data MapReduce Hadoop distribution architecture for processing input splits to solve the small data problem



Abstract

Hadoop is an open source Java framework for processing big data. It has two core components: HDFS (Hadoop Distributed File System), which stores very large volumes of data on inexpensive hardware and continues normal operation in the face of hardware or software faults, and MapReduce, a programming model and processing technique that runs in a parallel and distributed manner. Hadoop performs poorly on large numbers of small files, because each small file adds metadata load on the HDFS NameNode, which in turn prolongs the execution time of the MapReduce jobs that read them. Since Hadoop is designed specifically to handle very large files, it incurs a performance cost when dealing with a great number of small ones. This analysis provides a detailed description of HDFS, surveys existing ways of dealing with the small-file problem, and proposes an approach for handling small data files. In the proposed approach, small files are merged using MapReduce, the programming model on Hadoop; this improves Hadoop's performance in handling collections of small files whose combined size exceeds the block size. We also propose a traffic analyzer built on the combination of Hadoop and the MapReduce paradigm. The joint use of Hadoop and MapReduce makes it possible to provide batch analysis with low response time and in-memory computing capacity, so that logs are processed in a highly available, efficient, and stable way.
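The merge idea described in the abstract can be sketched as follows. This is a minimal JDK-only illustration, not the paper's implementation: it packs many small files into one container file plus an in-memory index of (offset, length) per file, so that a single large file replaces many NameNode metadata entries. The class name `SmallFileMerger` and the index layout are assumptions for illustration; a real Hadoop implementation would instead write filename/contents pairs into a SequenceFile inside a MapReduce job.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SmallFileMerger {
    // filename -> {offset, length} within the merged container file
    private final Map<String, long[]> index = new LinkedHashMap<>();

    // Concatenate all small files into one container, recording offsets.
    public void merge(List<Path> smallFiles, Path merged) throws IOException {
        long offset = 0;
        try (var out = Files.newOutputStream(merged)) {
            for (Path p : smallFiles) {
                byte[] data = Files.readAllBytes(p);
                out.write(data);
                index.put(p.getFileName().toString(), new long[]{offset, data.length});
                offset += data.length;
            }
        }
    }

    // Recover one original file's contents via the index, without
    // touching the (now deleted or archived) small files themselves.
    public String read(Path merged, String name) throws IOException {
        long[] entry = index.get(name);
        try (var ch = Files.newByteChannel(merged)) {
            ch.position(entry[0]);
            ByteBuffer buf = ByteBuffer.allocate((int) entry[1]);
            ch.read(buf);
            return new String(buf.array(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("smallfiles");
        List<Path> files = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            files.add(Files.writeString(dir.resolve("f" + i + ".txt"), "record-" + i));
        }
        Path merged = dir.resolve("merged.bin");
        SmallFileMerger m = new SmallFileMerger();
        m.merge(files, merged);
        System.out.println(m.read(merged, "f1.txt")); // prints record-1
    }
}
```

The design point is the same one the paper exploits: NameNode memory cost scales with the number of files and blocks, not with bytes, so packing N small files into one container cuts metadata load by roughly a factor of N while the index preserves per-file access.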
