METHOD FOR CONSTRUCTING AND UTILIZING INDEX TO IMPROVE DATA PROCESSING PERFORMANCE BASED ON MAPREDUCE IN HADOOP ENVIRONMENT

Abstract

The present invention relates to a MapReduce-based data processing method. Specifically, it relates to a method for constructing and utilizing an index to improve MapReduce-based data processing performance in a Hadoop environment: a secondary index is constructed so that big data can be processed efficiently with MapReduce in the Hadoop environment, and that secondary index is then used for MapReduce-based data processing. The index is constructed in the following steps. Each mapper that processes a file split outputs an intermediate result value, to be transmitted to a reducer, using an offset, a length, and a key value (K). The reducer then finds the section with the smallest offset and the section with the largest offset in each list of record sections, calculates the total offset and the total length from them, and stores the total offset and the total length in a split-level Hadoop index file.

COPYRIGHT KIPO 2017
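The index construction steps described in the abstract can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the patented implementation and not Hadoop API code: the function names (`map_split`, `reduce_sections`, `build_index`) are hypothetical, records are assumed to be newline-terminated with the key K as the first comma-separated field, and the "total length" is assumed to be the span from the start of the smallest-offset section to the end of the largest-offset section.

```python
from collections import defaultdict


def map_split(split_bytes, split_start=0):
    """Mapper sketch: emit (K, (offset, length)) for every record in one split.

    Assumes newline-terminated records whose first comma-separated
    field is the key K (an illustrative record format).
    """
    offset = split_start
    for record in split_bytes.splitlines(keepends=True):
        key = record.split(b",", 1)[0].decode()
        yield key, (offset, len(record))
        offset += len(record)


def reduce_sections(sections):
    """Reducer sketch: derive (total_offset, total_length) for one key.

    Uses the section with the smallest offset and the section with the
    largest offset; the span runs from the start of the former to the
    end of the latter (an assumed reading of the abstract).
    """
    first = min(sections, key=lambda s: s[0])
    last = max(sections, key=lambda s: s[0])
    total_offset = first[0]
    total_length = last[0] + last[1] - total_offset
    return total_offset, total_length


def build_index(split_bytes, split_start=0):
    """Simulate shuffle + reduce and return a split-level index as a dict."""
    grouped = defaultdict(list)
    for key, section in map_split(split_bytes, split_start):
        grouped[key].append(section)
    return {key: reduce_sections(secs) for key, secs in grouped.items()}


data = b"a,1\na,2\nb,3\nc,4\nc,5\n"
index = build_index(data)
offset, length = index["c"]
records_for_c = data[offset:offset + length]
```

With such an index, a later job could seek directly to `offset` and read `length` bytes for a given key instead of scanning the whole split. Note that under these assumptions the span is tight only when records with the same key are stored contiguously; otherwise it also covers records of other keys that lie in between.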
