Conference: International Conference on Very Large Data Bases

Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce

Abstract

Support for high performance queries on large volumes of spatial data is becoming increasingly important in many application domains, including geospatial problems in numerous fields, location based services, and emerging scientific applications that are increasingly data- and compute-intensive. The emergence of massive scale spatial data is due to the proliferation of cost effective and ubiquitous positioning technologies, the development of high resolution imaging technologies, and contributions from a large number of community users. There are two major challenges in managing and querying massive spatial data to support spatial queries: the explosion of spatial data, and the high computational complexity of spatial queries. In this paper, we present Hadoop-GIS, a scalable and high performance spatial data warehousing system for running large scale spatial queries on Hadoop. Hadoop-GIS supports multiple types of spatial queries on MapReduce through spatial partitioning, a customizable spatial query engine (RESQUE), implicit parallel spatial query execution on MapReduce, and effective methods for amending query results by handling boundary objects. Hadoop-GIS utilizes global partition indexing and customizable on-demand local spatial indexing to achieve efficient query processing. Hadoop-GIS is integrated into Hive to support declarative spatial queries with an integrated architecture. Our experiments have demonstrated the high efficiency of Hadoop-GIS in query response and its high scalability on commodity clusters. Our comparative experiments have shown that the performance of Hadoop-GIS is on par with parallel SDBMS and that it outperforms SDBMS for compute-intensive queries. Hadoop-GIS is available as a set of libraries for processing spatial queries, and as an integrated software package in Hive.
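The partition-then-amend idea mentioned in the abstract can be illustrated with a small single-process sketch. This is not the Hadoop-GIS or RESQUE code: the fixed grid, the bounding-box intersection test, and every name below (`tiles_for`, `map_phase`, `partitioned_join`) are assumptions made for illustration. An object that crosses a tile boundary is replicated to each tile it overlaps, each tile is joined independently (as a reducer might do), and duplicate result pairs caused by the replication are removed at the end.

```python
from collections import defaultdict
from itertools import product


def tiles_for(box, tile_size):
    """Return every grid tile (col, row) that a bounding box overlaps."""
    xmin, ymin, xmax, ymax = box
    cols = range(int(xmin // tile_size), int(xmax // tile_size) + 1)
    rows = range(int(ymin // tile_size), int(ymax // tile_size) + 1)
    return list(product(cols, rows))


def intersects(a, b):
    """Axis-aligned bounding-box intersection test (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def map_phase(dataset, tag, tile_size):
    """Emit (tile, record) pairs; boundary objects are emitted once per tile."""
    for obj_id, box in dataset.items():
        for tile in tiles_for(box, tile_size):
            yield tile, (tag, obj_id, box)


def partitioned_join(left, right, tile_size):
    """Join two datasets tile by tile, then deduplicate the results.

    Returns (raw_pairs, amended_pairs) so the effect of boundary-object
    replication on the raw output is visible.
    """
    # "Shuffle": group the emitted records by tile.
    groups = defaultdict(lambda: {"L": [], "R": []})
    for dataset, tag in ((left, "L"), (right, "R")):
        for tile, (t, obj_id, box) in map_phase(dataset, tag, tile_size):
            groups[tile][t].append((obj_id, box))

    # "Reduce": each tile is joined independently of the others.
    raw_pairs = []
    for group in groups.values():
        for left_id, left_box in group["L"]:
            for right_id, right_box in group["R"]:
                if intersects(left_box, right_box):
                    raw_pairs.append((left_id, right_id))

    # Amend: drop duplicate pairs produced by replicated boundary objects.
    return raw_pairs, sorted(set(raw_pairs))
```

Calling `partitioned_join` with two dictionaries that map object ids to `(xmin, ymin, xmax, ymax)` tuples returns both the raw and the deduplicated pair lists. A production system would presumably use exact geometry tests and a local spatial index inside each tile rather than the nested loop used here.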