Venue: International Conference on Inventive Computation Technologies

Implementation of on-process aggregation for efficient big data processing in Hadoop MapReduce environment


Abstract

The term Big Data refers to very large data whose volume, variability, and velocity make it difficult to manage, process, or analyze. Hadoop is used to analyze data of this kind, but processing is very time-consuming. One way to address this problem and reduce replication time is to execute the job partially, so that an approximate, early result becomes available to the user before the job completes. The proposed system provides a newer MapReduce architecture that allows data to be divided for easier and earlier processing. This saves time and also improves system utilization for batch jobs. The proposed system presents a newer version of the Hadoop MapReduce framework that supports on-process aggregation, which lets users obtain early results of a job while it is still computing. The technique will be evaluated using real-world datasets and applications, with the aim of improving the system's performance in terms of precision and time. The combiner introduced in this system is a local reducer: it executes after the map function and before the reducer. Instead of processing the complete file at once, on-process aggregation divides the file into a number of blocks, so that results are delivered in slots. Dividing the file into a number of data sets gives the user intermediate results as early as possible. The objective of the proposed technique is to improve the performance of Hadoop MapReduce for efficient and easy processing of very large data.
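The flow described in the abstract (split the input into blocks, run map, pre-aggregate with a combiner acting as a local reducer, and expose a partial result after each block) can be illustrated with a minimal sketch. This is a plain-Python simulation of the idea on a word count, not the paper's implementation and not the Hadoop API; the function names and the `block_size` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def split_into_blocks(lines: List[str], block_size: int) -> Iterator[List[str]]:
    """Divide the input into blocks of at most block_size lines (hypothetical parameter)."""
    for start in range(0, len(lines), block_size):
        yield lines[start:start + block_size]


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word.lower(), 1


def combine(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Combiner (local reducer): sum the map output of a single block."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return local


def on_process_word_count(lines: List[str], block_size: int) -> Iterator[Dict[str, int]]:
    """Yield a snapshot of the running totals after each block is processed."""
    totals = Counter()
    for block in split_into_blocks(lines, block_size):
        mapped = (pair for line in block for pair in map_words(line))
        totals.update(combine(mapped))  # merge this block's partial counts
        yield dict(totals)              # early, approximate result for the user


if __name__ == "__main__":
    sample = ["big data big", "data hadoop", "hadoop big"]
    for step, snapshot in enumerate(on_process_word_count(sample, block_size=2), 1):
        print(f"after block {step}: {snapshot}")
```

Each yielded snapshot plays the role of an early result: it is exact for the blocks seen so far and only approximate for the whole file, and the last snapshot equals the result of processing the complete input in one pass.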
