Implementation of on-process aggregation for efficient big data processing in Hadoop MapReduce environment

Abstract

The term Big Data refers to very large data sets whose volume, variability, and velocity make them difficult to manage, process, or analyze. Hadoop is used to analyze data of this kind. However, processing is very time-consuming. To address this problem and reduce response time, one solution is to execute the job partially, so that an approximate, early result becomes available to the user before the job completes. The proposed system offers a newer MapReduce architecture that allows data to be divided for easier and earlier processing. This saves time and also improves system utilization for batch jobs. The proposed system presents a newer version of the Hadoop MapReduce framework that supports on-process aggregation, which allows users to get early results of a job while it is still computing. The technique will be evaluated using real-world datasets and applications, with the aim of improving the system's performance in terms of precision and time. The combiner introduced in this system is a local reducer; it executes after the map function and before the reducer. Instead of processing the complete file at once, on-process aggregation divides the file into a number of blocks, which helps deliver results in slots. Dividing the file into a number of data sets helps provide results as early as possible by giving intermediate results to the user. The objective of the proposed technique is to improve the performance of Hadoop MapReduce for efficient and easy processing of very large data.
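The idea in the abstract (split the input into blocks, run a combiner as a local reducer after the map step, and expose intermediate results as each block finishes) can be illustrated with a minimal sketch. This is plain Python, not the Hadoop API and not the authors' implementation; the word-count workload and the names `split_into_blocks`, `map_words`, `combine`, and `on_process_aggregate` are assumptions chosen only for illustration.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Iterator


def split_into_blocks(lines: list[str], block_size: int) -> Iterator[list[str]]:
    """Divide the input into fixed-size blocks of lines (hypothetical splitter)."""
    for start in range(0, len(lines), block_size):
        yield lines[start:start + block_size]


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs: Iterable[tuple[str, int]]) -> Counter:
    """Combiner (local reducer): sum counts within one block before the reduce step."""
    local: Counter = Counter()
    for key, value in pairs:
        local[key] += value
    return local


def on_process_aggregate(
    lines: list[str], block_size: int
) -> Iterator[tuple[float, dict[str, int]]]:
    """Yield (fraction_processed, running_totals) after each block is reduced."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    totals: Counter = Counter()
    done = 0
    for block in split_into_blocks(lines, block_size):
        mapped = (pair for line in block for pair in map_words(line))
        totals.update(combine(mapped))  # reduce: merge this block's partial counts
        done += len(block)
        yield done / len(lines), dict(totals)  # early, approximate result


if __name__ == "__main__":
    data = ["big data hadoop", "hadoop map reduce", "map reduce big data", "hadoop"]
    for progress, snapshot in on_process_aggregate(data, block_size=2):
        print(f"{progress:.0%} processed: {snapshot}")
```

Each yielded snapshot is an approximate answer based on the blocks seen so far; the last snapshot equals the result of processing the whole file at once. A smaller block size gives earlier but coarser intermediate results.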

Bibliographic details

Source
International Conference on Inventive Computation Technologies | 2016 | pp. 1-5 | 5 pages
Authors
Vidya V. Pol; S.M. Patil
Original format
PDF
Keywords
Big data; Computer architecture; Data mining; Data analysis; Bandwidth; Algorithm design and analysis

Similar literature

1. Efficient Implementation of Hadoop MapReduce based Business Process Dataflow [J]. Ishak H.A. Meddah, Khaled Belkadi, Mohamed Amine Boudia. International Journal of Decision Support System Technology. 2017, Issue 1
2. Hadoop MapReduce in Cloud Environments for Scientific Data Processing [J]. Kong Xiangsheng. Journal of Theoretical and Applied Information Technology. 2013, Issue 3
3. Impact of Processing and Analyzing Healthcare Big Data on Cloud Computing Environment by Implementing Hadoop Cluster [J]. Sreekanth Rallapalli, R.R. Gondkar, Uma Pavan Kumar Ketavarapu. Procedia Computer Science. 2016, Issue 1
4. Implementation of on-process aggregation for efficient big data processing in Hadoop MapReduce environment [C]. Vidya V. Pol, S.M. Patil. International Conference on Inventive Computation Technologies. 2016
5. Data intensive query processing for Semantic Web data using Hadoop and MapReduce [D]. Husain, Mohammad Farhan. 2011
6. Efficient implementation of convolutional neural networks in the data processing of two-photon in vivo imaging [O]. Yangzhen Wang, Feng Su, Shanshan Wang
7. High Performance Risk Aggregation: Addressing the Data Processing Challenge the Hadoop MapReduce Way [O]. Yao, Zhimin; Varghese, Blesson; Rau-Chaplin, Andrew. 2013
