
A Comparison of Distributed Spatial Data Management Systems for Processing Distance Join Queries


Abstract

Due to the ubiquitous use of spatial data applications and the large amounts of spatial data that these applications generate, the processing of large-scale distance joins in distributed systems is becoming increasingly popular. Two of the most studied distance join queries are the K Closest Pair Query (KCPQ) and the ε Distance Join Query (εDJQ). The KCPQ finds the K closest pairs of points from two datasets, and the εDJQ finds all pairs of points from two datasets that are within a distance threshold ε of each other. Distributed cluster-based computing systems can be classified into Hadoop-based and Spark-based systems. Based on this classification, in this paper we compare two of the most recent and leading distributed spatial data management systems, namely SpatialHadoop and LocationSpark, by evaluating the performance of existing and newly proposed parallel and distributed distance join query algorithms in different situations with large real-world datasets. As a general conclusion, while SpatialHadoop is a more mature and robust system, LocationSpark is the winner with respect to total execution time.
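
To make the two query definitions concrete, the following is a minimal single-machine sketch in Python. It is a brute-force illustration only, not the distributed algorithms evaluated in the paper and not the API of SpatialHadoop or LocationSpark. The function names `kcpq` and `edjq`, the use of Euclidean distance, and the inclusive comparison `<= eps` are assumptions made for this example.

```python
import heapq
from itertools import product
from math import dist  # Euclidean distance (assumed metric)


def kcpq(p, q, k):
    """Hypothetical helper: the k closest pairs between point sets p and q.

    Returns (distance, point_from_p, point_from_q) tuples, nearest first.
    Brute force over all |p| * |q| pairs.
    """
    candidates = ((dist(a, b), a, b) for a, b in product(p, q))
    return heapq.nsmallest(k, candidates)


def edjq(p, q, eps):
    """Hypothetical helper: all pairs (a, b) with distance at most eps.

    The inclusive bound (<=) is an assumption; the abstract only says
    "within a distance threshold".
    """
    return [(a, b) for a, b in product(p, q) if dist(a, b) <= eps]


if __name__ == "__main__":
    P = [(0.0, 0.0), (5.0, 5.0)]
    Q = [(1.0, 0.0), (5.0, 6.0), (9.0, 9.0)]
    print(kcpq(P, Q, 2))    # the two nearest pairs across P and Q
    print(edjq(P, Q, 1.5))  # every pair no farther apart than 1.5
```

Both functions examine every pair, so the cost grows with the product of the dataset sizes. That quadratic cost is the reason large-scale distance joins are usually partitioned and run on distributed systems.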
