Journal: ACM Transactions on Spatial Algorithms and Systems

Distributed Subtrajectory Join on Massive Datasets


Abstract

Joining trajectory datasets is a significant operation in mobility data analytics and the cornerstone of various methods that aim to extract knowledge from them. In the era of Big Data, the production of mobility data has become massive and, consequently, performing such an operation in a centralized way is not feasible. In this article, we address the problem of Distributed Subtrajectory Join processing by utilizing the MapReduce programming model. Compared to traditional trajectory join queries, this problem is even more challenging, since the goal is to retrieve all the "maximal" portions of trajectories that are "similar." We propose three solutions: (i) a well-designed basic solution, coined DTJb; (ii) a solution that uses a preprocessing step that repartitions the data, labeled DTJr; and (iii) a solution that, additionally, employs an indexing scheme, named DTJi. In our experimental study, we utilize a 56 GB dataset of real trajectories from the maritime domain, which, to the best of our knowledge, is the largest real dataset used for experimentation in the literature of trajectory data management. The results show that DTJi performs up to 16× faster than DTJb, 10× faster than DTJr, and 3× faster than the closest related state-of-the-art algorithm.
