Journal: IEEE Transactions on Big Data

Sampling Big Trajectory Data for Traversal Trajectory Aggregate Query



Abstract

This paper defines and investigates a novel trajectory query, the Traversal Trajectory Aggregate (TTA) query: given a trajectory database and a pair of upstream and downstream spatio-temporal (ST) regions (i.e., a spatial area coupled with a time interval), a TTA query retrieves the total number of unique trajectories that traverse both ST regions. TTA queries play an important role in urban applications such as route planning, taxi dispatching, and location-based advertising. Two baselines can answer them: (a) exact search over the entire ST query regions returns the exact answer but runs extremely long when the regions are large; (b) uniform-sampling-based approaches estimate the answer from sampled trajectories. Uniform sampling, however, can produce high estimation variance for TTA queries, because traversal trajectories are relatively few and unevenly distributed in the query regions. To tackle these challenges, this paper proposes a novel Targeted Index Sampling (TIS) framework that answers TTA queries with high estimation accuracy. TIS has two stages: a Pilot Sampling Estimation (PSE) stage estimates the distribution of trajectories in the ST query regions, and an Integrated Importance Sampling (IIS) stage collects trajectories according to the importance sampling distribution obtained in PSE and estimates the query result with an asymptotically unbiased estimator. Extensive experiments and case studies on a large-scale real taxi trajectory dataset from Shenzhen, China demonstrate that TIS achieves $\leq$10 percent estimation error with a $\geq$90 percent reduction in computation time relative to exact search, and a 50 percent reduction in estimation error (at similar running time) relative to uniform-distribution-based sampling approaches.
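The two-stage idea in the abstract can be illustrated with a generic importance-sampling sketch. This is not the paper's TIS algorithm: the grouping of trajectories into index cells, the smoothing constant, the toy data, and every function name below are assumptions made for illustration. Each trajectory is reduced to a boolean that says whether it traverses both ST regions.

```python
import random

def exact_count(groups):
    """Exact answer: scan every trajectory (the slow baseline)."""
    return sum(sum(g) for g in groups)

def pilot_proposal(groups, pilot_size, rng):
    """Stage 1 (pilot): a small uniform sample per group yields a
    sampling probability for each group."""
    masses = []
    for g in groups:
        hits = sum(rng.sample(g, pilot_size))
        # Additive smoothing (an assumed choice) keeps every group's
        # probability positive, so no traversing trajectory is unreachable.
        rate = (hits + 0.5) / (pilot_size + 1)
        masses.append(rate * len(g))
    total = sum(masses)
    return [m / total for m in masses]

def importance_estimate(groups, proposal, n, rng):
    """Stage 2: draw n trajectories from the proposal (with replacement)
    and reweight each hit by the inverse of its draw probability."""
    picks = rng.choices(range(len(groups)), weights=proposal, k=n)
    acc = 0.0
    for gi in picks:
        g = groups[gi]
        if rng.choice(g):  # sampled trajectory traverses both regions
            # Its draw probability is proposal[gi] / len(g).
            acc += len(g) / proposal[gi]
    return acc / n

def uniform_estimate(groups, n, rng):
    """Uniform-sampling baseline over all trajectories."""
    flat = [x for g in groups for x in g]
    hits = sum(rng.choice(flat) for _ in range(n))
    return hits / n * len(flat)

def make_toy_groups():
    """Hypothetical data: 20 groups of 1000 trajectories each;
    traversing trajectories are concentrated in the first two groups."""
    sizes_and_hits = [(1000, 200), (1000, 100)] + [(1000, 0)] * 18
    return [[True] * h + [False] * (s - h) for s, h in sizes_and_hits]

rng = random.Random(0)
groups = make_toy_groups()
proposal = pilot_proposal(groups, pilot_size=50, rng=rng)
print("exact:", exact_count(groups))
print("importance sampling:", round(importance_estimate(groups, proposal, 2000, rng)))
print("uniform sampling:", round(uniform_estimate(groups, 2000, rng)))
```

Because the pilot steers most draws toward the groups where traversing trajectories concentrate, the reweighted estimate should typically vary less across runs than the uniform baseline at the same sample size, which is the motivation the abstract gives for moving away from uniform sampling.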
