Journal: Personal and Ubiquitous Computing

A probabilistic approach to mining mobile phone data sequences

Abstract

We present a new approach to address the problem of large sequence mining from big data. The specific problem of interest is mining long sequences from large-scale location data effectively enough to be practical for Reality Mining applications, which suffer from large amounts of noise and a lack of ground truth. To handle such complex data, we propose an unsupervised probabilistic topic model called the distant n-gram topic model (DNTM). The DNTM is based on latent Dirichlet allocation (LDA), which is extended to integrate sequential information. We define the generative process for the model, derive the inference procedure, and evaluate our model on both synthetic data and real mobile phone data. We consider two different mobile phone datasets containing natural human mobility patterns obtained by location sensing, the first using GPS/Wi-Fi locations and the second using cell tower connections. The DNTM discovers meaningful topics on the synthetic data as well as on the two mobile phone datasets. Finally, the DNTM is compared to LDA in terms of log-likelihood on unseen data, which demonstrates the predictive power of the model. The results show that the DNTM consistently outperforms LDA as the sequence length increases.
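As a rough illustration of the "log-likelihood on unseen data" comparison, the sketch below scores a held-out document under a plain LDA-style mixture, where each token has probability p(w) = sum_k theta[k] * phi[k][w]. It is not the paper's evaluation code and does not implement the DNTM's sequence modelling; the function name `heldout_log_likelihood` and all toy numbers are assumptions made for this example.

```python
import math

def heldout_log_likelihood(doc, theta, phi):
    """Log-likelihood of a held-out document under a topic mixture.

    doc   : list of word (location) indices
    theta : topic proportions for this document, length K
    phi   : K rows, each a distribution over the vocabulary
    Assumes theta and phi were already estimated elsewhere.
    """
    total = 0.0
    for w in doc:
        # Marginalise over topics: p(w) = sum_k theta[k] * phi[k][w]
        p = sum(t * topic[w] for t, topic in zip(theta, phi))
        total += math.log(p)
    return total

# Toy setup (made-up numbers): 2 topics over a 3-location vocabulary.
phi = [[0.8, 0.1, 0.1],
       [0.1, 0.1, 0.8]]
theta = [0.5, 0.5]
doc = [0, 2, 0]

ll = heldout_log_likelihood(doc, theta, phi)
per_token = ll / len(doc)  # length-normalised score, easier to compare across documents
```

A higher (less negative) held-out log-likelihood indicates that a model assigns more probability to the unseen data. The abstract reports the comparison as a function of sequence length but does not say how scores are normalised.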
