Source: The VLDB Journal

Solving the data sparsity problem in destination prediction


Abstract

Destination prediction is an essential task for many emerging location-based applications, such as recommending sightseeing places and targeting advertisements according to destination. A common approach is to derive the probability of a location being the destination from historical trajectories. However, almost all existing techniques rely on extra information such as road networks, proprietary travel planners, statistics requested from government, and personal driving habits. In most circumstances, such information is unavailable or very costly to obtain. We therefore approach destination prediction using only a historical trajectory dataset. This approach encounters the "data sparsity problem": the available historical trajectories are far from enough to cover all possible query trajectories, which considerably limits the number of query trajectories for which a destination can be predicted. We propose a novel method named Sub-Trajectory Synthesis (SubSyn) to address the data sparsity problem. SubSyn first decomposes historical trajectories into sub-trajectories comprising two adjacent locations, and then connects the sub-trajectories into "synthesised" trajectories. This process effectively expands the historical trajectory dataset to contain many more trajectories. Experiments on real datasets show that SubSyn can predict destinations for up to ten times more query trajectories than a baseline prediction algorithm. Furthermore, the running time of the SubSyn-training algorithm is almost negligible for a large set of 1.9 million trajectories, and the SubSyn-prediction algorithm consistently runs over two orders of magnitude faster than the baseline prediction algorithm.
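
The decomposition and synthesis idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how locations are represented or how destination probabilities are computed, so the sketch assumes locations are discrete labels (for example grid cells) and that adjacent-location pairs are treated as first-order transitions. All function names below are hypothetical.

```python
from collections import Counter, defaultdict


def decompose(trajectories):
    """Count every sub-trajectory made of two adjacent locations."""
    pairs = Counter()
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            pairs[(a, b)] += 1
    return pairs


def transition_probs(pairs):
    """Normalise pair counts into first-order transition probabilities
    (an assumption of this sketch, not stated in the abstract)."""
    totals = Counter()
    for (a, _), n in pairs.items():
        totals[a] += n
    probs = defaultdict(dict)
    for (a, b), n in pairs.items():
        probs[a][b] = n / totals[a]
    return probs


def baseline_matches(query, trajectories):
    """True if the query occurs contiguously inside some historical trajectory."""
    k = len(query)
    return any(
        traj[i:i + k] == query
        for traj in trajectories
        for i in range(len(traj) - k + 1)
    )


def synthesis_matches(query, pairs):
    """True if the query can be stitched together from observed adjacent pairs."""
    return all((a, b) in pairs for a, b in zip(query, query[1:]))


def step_distribution(probs, start, steps):
    """Probability of each location after `steps` transitions from `start`.
    Probability mass is dropped at locations with no observed outgoing pair."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for loc, p in dist.items():
            for b, q in probs.get(loc, {}).items():
                nxt[b] += p * q
        dist = dict(nxt)
    return dist


# Toy data: no historical trajectory contains A -> B -> D,
# but the pairs (A, B) and (B, D) were each observed.
history = [["A", "B", "C"], ["B", "D", "E"]]
pairs = decompose(history)
probs = transition_probs(pairs)
query = ["A", "B", "D"]
```

On this toy data, `baseline_matches(query, history)` is false because no single historical trajectory contains the query, while `synthesis_matches(query, pairs)` is true because the query can be assembled from observed two-location pieces. That is the sense in which synthesis lets more query trajectories receive a prediction.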
