首页> 外文期刊>The VLDB journal >Parallel trajectory similarity joins in spatial networks
【24h】

Parallel trajectory similarity joins in spatial networks

机译:空间网络中的平行轨迹相似性

获取原文
获取原文并翻译 | 示例
           

摘要

The matching of similar pairs of objects, called similarity join, is fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), including a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold , the Tb-TS-Join returns all pairs of trajectories from the two sets with similarity above . In contrast, the k-TS-Join does not take a threshold as a parameter, and it returns the top-k most similar trajectory pairs from the two sets. The TS-Joins target diverse applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide purposeful definitions of similarity. To enable efficient processing of the TS-Joins on large sets of trajectories, we develop search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer search framework that lays the foundation for the algorithms for the Tb-TS-Join and the k-TS-Join that rely on different pruning techniques to achieve efficiency. For each trajectory, the algorithms first find similar trajectories. Then they merge the results to obtain the final result. The algorithms for the two joins exploit different upper and lower bounds on the spatiotemporal trajectory similarity and different heuristic scheduling strategies for search space pruning. Their per-trajectory searches are independent of each other and can be performed in parallel, and the mergings have constant cost. An empirical study with real data offers insight in the performance of the algorithms and demonstrates that they are capable of outperforming well-designed baseline algorithms by an order of magnitude.
机译:相似对象对的匹配(称为相似联接)是数据管理中的基本功能。我们考虑了两种轨迹相似联接(TS-Joins),包括基于阈值的联接(Tb-TS-Join)和前k个TS-Join(k-TS-Join),其中对象是车辆的轨迹在道路网中移动。给定两组轨迹和一个阈值,Tb-TS-Join返回两组具有以上相似性的所有轨迹对。相反,k-TS-Join不将阈值作为参数,而是从两组返回前k个最相似的轨迹对。 TS-Joins针对各种应用,例如轨迹近重复检测,数据清理,拼车推荐和交通拥堵预测。考虑到这些应用程序,我们提供了相似性的有目的的定义。为了能够在大量轨迹上高效处理TS-Joins,我们开发了搜索空间修剪技术,并允许使用现代处理器的并行处理能力。具体来说,我们提出了一个两阶段分而治之的搜索框架,该框架为Tb-TS-Join和k-TS-Join的算法奠定了基础,后者依赖于不同的修剪技术来实现效率。对于每个轨迹,算法首先找到相似的轨迹。然后他们合并结果以获得最终结果。两种连接的算法在时空轨迹相似性上采用不同的上限和下限,并且在搜索空间修剪上采用了不同的启发式调度策略。它们的每轨迹搜索彼此独立,并且可以并行执行,并且合并的成本恒定。对真实数据进行的实证研究提供了算法性能的见解,并证明了它们能够比精心设计的基线算法好一个数量级。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号