Enhancing SpatialHadoop with Closest Pair Queries

Abstract

Given two datasets P and Q, the K Closest Pair Query (KCPQ) finds the K closest pairs of objects from P×Q. It is an operation widely adopted by many spatial and GIS applications. As a combination of the K Nearest Neighbor (KNN) and the spatial join queries, KCPQ is an expensive operation. Given the increasing volume of spatial data, it is difficult to perform a KCPQ on a centralized machine efficiently. For this reason, this paper addresses the problem of computing the KCPQ on big spatial datasets in SpatialHadoop, an extension of Hadoop that supports spatial operations efficiently, and proposes a novel algorithm in SpatialHadoop to perform efficient parallel KCPQ on large-scale spatial datasets. We have evaluated the performance of the algorithm in several situations with big synthetic and real-world datasets. The experiments have demonstrated the efficiency and scalability of our proposal.
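To make the query definition concrete, the following is a minimal brute-force sketch of a KCPQ on a single machine. It is not the parallel SpatialHadoop algorithm the paper proposes; it only illustrates what the query returns and why the cost grows with |P|·|Q|. The function name `k_closest_pairs`, the use of Euclidean distance, and the sample points are assumptions made for illustration.

```python
import heapq
import math
from itertools import product


def k_closest_pairs(P, Q, k):
    """Return the k closest (distance, p, q) triples from P x Q, nearest first.

    Brute force: every pair in the Cartesian product is examined, so the
    running time is proportional to len(P) * len(Q).
    Euclidean distance is an assumption made for this sketch.
    """
    candidates = ((math.dist(p, q), p, q) for p, q in product(P, Q))
    # nsmallest keeps only k candidates in memory while scanning all pairs.
    return heapq.nsmallest(k, candidates, key=lambda triple: triple[0])


# Hypothetical sample data: points as (x, y) tuples.
P = [(0, 0), (5, 5)]
Q = [(1, 0), (5, 7), (9, 9)]

for distance, p, q in k_closest_pairs(P, Q, k=2):
    print(f"{p} - {q}: {distance:.2f}")
```

Because every pair is visited, this baseline becomes impractical as the datasets grow, which is the motivation the abstract gives for a distributed approach.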
