Journal: IEEE Transactions on Knowledge and Data Engineering

TopicSketch: Real-Time Bursty Topic Detection from Twitter

Abstract

Twitter has become one of the largest microblogging platforms, where users around the world share what is happening around them with friends and beyond. A bursty topic on Twitter is one that triggers a surge of relevant tweets within a short period of time, and it often reflects an important event of mass interest. How to leverage Twitter for early detection of bursty topics has therefore become an important research problem with immense practical value. Despite the wealth of research on topic modelling and analysis in Twitter, detecting bursty topics in real time remains a challenge. Because existing methods can hardly scale to handle the tweet stream in real time, we propose in this paper TopicSketch, a sketch-based topic model together with a set of techniques to achieve real-time detection. We evaluate our solution on a tweet stream with over 30 million tweets. Our experimental results show both the efficiency and the effectiveness of our approach. In particular, they demonstrate that TopicSketch on a single machine can potentially handle hundreds of millions of tweets per day, which is on the same scale as the total number of daily tweets on Twitter, and can present bursty events at a finer granularity.
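To make the notion of a "surge of relevant tweets within a short period of time" concrete, the following is a minimal illustrative sketch. It is not the TopicSketch algorithm, whose details the abstract does not give. It assumes a simplified setting in which the stream is cut into consecutive time windows and a fixed-size, count-min-style table of hashed counters estimates how many tweets mention a word in each window. The class `BurstSketch`, its methods, and the `width` and `depth` parameters are all hypothetical names introduced here for illustration.

```python
import hashlib


class BurstSketch:
    """Toy count-min-style sketch of per-window word counts.

    Illustrative only: an assumed simplification, not the paper's model.
    """

    def __init__(self, width=1024, depth=4):
        self.width = width    # counters per hash row (assumed value)
        self.depth = depth    # number of independent hash rows (assumed value)
        self.current = [[0] * width for _ in range(depth)]
        self.previous = [[0] * width for _ in range(depth)]

    def _buckets(self, word):
        # One bucket per row, derived from a row-salted MD5 digest.
        for row in range(self.depth):
            digest = hashlib.md5(f"{row}:{word}".encode("utf-8")).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, tweet):
        # Count each distinct word once per tweet.
        for word in set(tweet.lower().split()):
            for row, col in self._buckets(word):
                self.current[row][col] += 1

    def roll_window(self):
        # Close the current time window and start an empty one.
        self.previous = self.current
        self.current = [[0] * self.width for _ in range(self.depth)]

    def _estimate(self, table, word):
        # Count-min estimate: the minimum over rows never undercounts.
        return min(table[row][col] for row, col in self._buckets(word))

    def surge(self, word):
        # Estimated change in tweets mentioning `word` versus the previous window.
        return self._estimate(self.current, word) - self._estimate(self.previous, word)


sketch = BurstSketch()
for tweet in ["rain today", "coffee time"]:
    sketch.add(tweet)
sketch.roll_window()
for tweet in ["earthquake now", "big earthquake", "earthquake felt here", "coffee time"]:
    sketch.add(tweet)

print(sketch.surge("earthquake"), sketch.surge("coffee"))
```

Memory use in this toy version depends only on `width` and `depth`, not on the vocabulary size or the number of tweets, which is the general reason sketch structures suit high-volume streams. A word whose estimated count jumps sharply between windows is a candidate for a bursty topic.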
