Journal: 《电子设计工程》 (Electronic Design Engineering)

A Region Retrieval Algorithm for Space Science Big Data Based on Hadoop

         

Abstract

To support fast retrieval of space science big data, this paper proposes a distributed region retrieval algorithm. The algorithm has two parts: an indexing method for four-dimensional space science data, and a distributed index architecture for that data. Under the KTS storage structure, a two-level spatial index is built with the cube-based Block-Grid three-dimensional grid subdivision method; it comprises a global index across the distributed nodes and a local index within each node. Under the distributed system architecture, the paper determines how the index is distributed between master and slave nodes and how data fault tolerance is handled in the distributed environment. The NSSC-Hadoop system is designed on top of the Hadoop infrastructure. In experiments on several data sets, compared with traversing the data directly on Hadoop without an index, the algorithm improves retrieval efficiency by nearly 50 times, and its advantage becomes more pronounced as the data volume grows.
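The two-level idea in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the abstract does not give the details of KTS, Block-Grid, or NSSC-Hadoop, so everything here is an assumption for illustration. That covers the class name `TwoLevelIndex`, the cube edge lengths `BLOCK_SIZE` and `GRID_SIZE`, the round-robin assignment of blocks to nodes, and the handling of time as a simple filter. The global index maps coarse cubes (blocks) to nodes, and each node's local index maps fine cubes (grid cells) to records.

```python
from collections import defaultdict

BLOCK_SIZE = 100.0  # edge length of a coarse cube (assumed value)
GRID_SIZE = 10.0    # edge length of a fine grid cell (assumed value)


def cell_of(point, size):
    """Return the integer coordinates of the cube of edge `size` containing a 3D point."""
    return tuple(int(c // size) for c in point)


def in_box(lo, value, hi):
    """True if `value` lies inside the axis-aligned box [lo, hi] in every dimension."""
    return all(l <= v <= h for l, v, h in zip(lo, value, hi))


class TwoLevelIndex:
    """Hypothetical two-level index: blocks -> nodes (global), cells -> records (local)."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.global_index = {}  # block coordinates -> node id
        # node id -> {grid cell coordinates -> list of (x, y, z, t, payload)}
        self.local_index = defaultdict(lambda: defaultdict(list))

    def insert(self, x, y, z, t, payload):
        block = cell_of((x, y, z), BLOCK_SIZE)
        if block not in self.global_index:
            # Assumed placement policy: assign new blocks to nodes round-robin.
            self.global_index[block] = len(self.global_index) % self.num_nodes
        node = self.global_index[block]
        cell = cell_of((x, y, z), GRID_SIZE)
        self.local_index[node][cell].append((x, y, z, t, payload))

    def query(self, lo, hi, t_min, t_max):
        """Return payloads of records inside the box [lo, hi] and the time range."""
        # Step 1 (global index): find the nodes that own blocks overlapping the box.
        b_lo, b_hi = cell_of(lo, BLOCK_SIZE), cell_of(hi, BLOCK_SIZE)
        nodes = {node for block, node in self.global_index.items()
                 if in_box(b_lo, block, b_hi)}
        # Step 2 (local index): on each such node, visit only overlapping occupied cells.
        g_lo, g_hi = cell_of(lo, GRID_SIZE), cell_of(hi, GRID_SIZE)
        hits = []
        for node in sorted(nodes):
            for cell, records in self.local_index[node].items():
                if not in_box(g_lo, cell, g_hi):
                    continue
                # Step 3: exact check on coordinates and time for the surviving records.
                for x, y, z, t, payload in records:
                    if in_box(lo, (x, y, z), hi) and t_min <= t <= t_max:
                        hits.append(payload)
        return hits
```

A region query such as `TwoLevelIndex(3).query((0, 0, 0), (20, 20, 20), 0, 10)` touches only nodes and cells that overlap the box, which is the source of the speedup over an unindexed full scan. In a real Hadoop deployment the local indexes would live on separate nodes rather than in one dictionary.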
