Journal: IEEE Transactions on Knowledge and Data Engineering

Efficient Identification of Local Keyword Patterns in Microblogging Platforms

Abstract

Microblogging platforms, such as Twitter, serve as an important and efficient channel for sharing information. With the prevalence of geo-position-enabled devices, a rapidly growing number of microblogs are associated with geo-tags. Consequently, real-time analysis of the geo-tagged microblog stream has attracted great attention. In this paper, we advocate the significance of keyword co-occurrence for geo-tagged microblog analysis, which has been overlooked by existing studies. The co-occurrence of keywords is necessary to resolve ambiguity in event analysis, especially when different events have overlapping descriptions. Given a geo-tagged microblog stream, we formally define the problem of identifying local (top-K) maximal frequent keyword co-occurrence patterns over the stream, namely the LFP (LKFP) query. Given a query region, the LFP query aims to retrieve the local maximal keyword patterns whose frequency exceeds a given threshold, while the LKFP query aims to identify the K maximal keyword patterns with the highest local frequency, in case users do not have a threshold in mind. To handle the high-volume microblog stream and to keep up when a large number of queries are issued, we develop novel data structures to maintain the data stream, and propose efficient algorithms with theoretical underpinnings to process LFP and LKFP queries. An extensive empirical study on a real dataset confirms the effectiveness and efficiency of our approaches.
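
To make the two query types concrete, the following is a minimal brute-force sketch in Python. It is not the data structures or algorithms proposed in the paper, which the abstract does not describe. Several details are assumptions made only for illustration: the query region is an axis-aligned rectangle, the frequency of a pattern is the number of microblogs inside the region that contain all of its keywords, a pattern has at least two keywords, a pattern is maximal when no frequent proper superset exists, and the LKFP query is read as lowering the frequency cutoff until at least K maximal patterns appear. All function and parameter names (`local_counts`, `lfp_query`, `lkfp_query`, `max_len`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(posts, region, max_len=3):
    """Count keyword sets (size 2..max_len) over posts located inside region.

    posts  : iterable of ((x, y), [keyword, ...])
    region : (x_min, y_min, x_max, y_max), an assumed rectangular query region
    """
    x_min, y_min, x_max, y_max = region
    counts = Counter()
    for (x, y), keywords in posts:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # post lies outside the query region
        unique = sorted(set(keywords))
        for size in range(2, max_len + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return counts


def maximal_patterns(frequent):
    """Keep only patterns that have no proper superset among the frequent ones."""
    return {
        p: c for p, c in frequent.items()
        if not any(p < q for q in frequent)
    }


def lfp_query(posts, region, threshold, max_len=3):
    """Assumed LFP semantics: maximal patterns with local frequency > threshold."""
    counts = local_counts(posts, region, max_len)
    frequent = {p: c for p, c in counts.items() if c > threshold}
    return maximal_patterns(frequent)


def lkfp_query(posts, region, k, max_len=3):
    """Assumed LKFP semantics: lower the cutoff until k maximal patterns exist."""
    counts = local_counts(posts, region, max_len)
    result = {}
    for cutoff in sorted(set(counts.values()), reverse=True):
        frequent = {p: c for p, c in counts.items() if c >= cutoff}
        result = maximal_patterns(frequent)
        if len(result) >= k:
            break
    ranked = sorted(result.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]


# Toy data: "apple" is ambiguous on its own, but the co-occurring keywords
# separate a product launch from a recipe discussion.
posts = [
    ((1.0, 1.0), ["apple", "launch", "iphone"]),
    ((1.5, 1.2), ["apple", "launch", "iphone"]),
    ((1.2, 1.8), ["apple", "pie", "recipe"]),
    ((1.1, 1.1), ["apple", "pie"]),
    ((9.0, 9.0), ["apple", "launch", "iphone"]),  # outside the region below
]
region = (0.0, 0.0, 2.0, 2.0)
patterns = lfp_query(posts, region, threshold=1)
```

On this toy data, `patterns` holds two maximal patterns, `{apple, iphone, launch}` and `{apple, pie}`, each with local frequency 2. Enumerating every keyword subset per post is exponential in the number of keywords, so this sketch only illustrates the query semantics and is not suitable for a high-volume stream.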