
Exploring a scalable solution to identifying events in noisy Twitter streams


Abstract

The unprecedented use of social media through smartphones and other web-enabled mobile devices has driven the rapid adoption of platforms like Twitter. Event detection has found many applications on the web, including breaking news identification and summarization. The recent increase in the use of Twitter during crises has led researchers to focus on detecting events in tweets. However, current solutions have focused on static Twitter data. The need to detect events in a streaming environment during fast-paced events such as crises presents new opportunities and challenges. In this paper, we investigate event detection in the context of real-time Twitter streams as observed in real-world crises. We highlight the key challenges in this problem: the informal nature of the text, and the high volume and high velocity of Twitter streams. We present a novel approach that addresses these challenges by using single-pass clustering and the compression distance to efficiently detect events in Twitter streams. Through experiments on large Twitter datasets, we demonstrate that the proposed framework is able to detect events in near real-time and can scale to large and noisy Twitter streams.
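
The abstract names two building blocks, single-pass clustering and a compression distance, without giving details. The following is only a minimal sketch of how the two could fit together, not the authors' framework. It assumes the normalized compression distance (NCD) computed with zlib, takes the first tweet of each cluster as its representative, and uses an arbitrary threshold of 0.6; the function names `ncd` and `single_pass_cluster` are hypothetical.

```python
import zlib


def compressed_size(text: str) -> int:
    """Number of bytes after zlib compression (assumed compressor)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: small when x and y share content."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + " " + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def single_pass_cluster(tweets, threshold=0.6):
    """Assign each tweet once, in arrival order, to the nearest cluster.

    Each cluster is a list of tweets; its first tweet serves as the
    representative (an assumption made for this sketch). A tweet whose
    distance to every representative is at least `threshold` starts a
    new cluster, which is treated here as a candidate event.
    """
    clusters = []
    for tweet in tweets:
        best_index, best_distance = None, threshold
        for i, cluster in enumerate(clusters):
            d = ncd(tweet, cluster[0])
            if d < best_distance:
                best_index, best_distance = i, d
        if best_index is None:
            clusters.append([tweet])
        else:
            clusters[best_index].append(tweet)
    return clusters


if __name__ == "__main__":
    stream = [
        "earthquake hits downtown area, buildings shaking and people evacuating",
        "earthquake hits downtown area, buildings shaking and people evacuating",
        "new phone launch event announced with pricing details for next month",
    ]
    for cluster in single_pass_cluster(stream):
        print(len(cluster), cluster[0])
```

Each tweet is read exactly once and compared only against cluster representatives, which is what makes the scheme suitable for a stream. The compression distance needs no tokenization or vocabulary, which is one plausible reason it suits informal text. Note that on very short strings the compressor's fixed overhead makes NCD noisy, so the threshold would need tuning on real data.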