Journal of the American Society for Information Science and Technology

Automatic Event Detection in Microblogs Using Incremental Machine Learning


Abstract

The global popularity of microblogs has led to the accumulation of large volumes of text data on microblogging platforms such as Twitter. These corpora are untapped resources for understanding social expressions on diverse subjects. Microblog analysis aims to unlock the value of such expressions by discovering insights and significant events hidden among swathes of text. Besides velocity, the key challenges in microblog analysis are diversity of content, brevity, absence of structure, and time-sensitivity. In this paper, we propose an unsupervised incremental machine learning and event detection technique to address these challenges. The proposed technique separates a microblog discussion into topics to address the key problem of diversity, and it maintains a record of the evolution of each topic over time. These individual topic pathways address brevity, time-sensitivity, and the unstructured nature of the data, and together they generate a temporal, topic-driven structure of a microblog discussion. The proposed event detection method continuously monitors these topic pathways for events of significance, using multiple domain-independent event indicators. The autonomous nature of topic separation, topic pathway generation, new topic identification, and event detection makes the proposed technique suitable for extensive applications in microblog analysis. We demonstrate these capabilities on tweets containing Microsoft and tweets containing #obama.
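The abstract does not specify the learning algorithm or the event indicators, so the following is only a minimal sketch of the general idea: posts arrive in time windows, each post is assigned to the most similar existing topic or starts a new one, per-topic counts per window form a "pathway", and simple indicators flag candidate events. The cosine-similarity assignment, the two indicators (`new_topic`, `volume_spike`), and all names and thresholds (`TopicTracker`, `sim_threshold`, `spike_ratio`) are illustrative assumptions, not the authors' method.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TopicTracker:
    """Hypothetical incremental topic tracker (not the paper's algorithm)."""

    def __init__(self, sim_threshold=0.3, spike_ratio=2.0):
        self.sim_threshold = sim_threshold  # assumed cut-off for joining a topic
        self.spike_ratio = spike_ratio      # assumed growth factor that counts as a spike
        self.centroids = []                 # one term-count Counter per topic
        self.pathways = []                  # per topic: post counts, one entry per window
        self.windows_seen = 0

    def process_window(self, posts):
        """Consume one time window of posts; return a list of (topic_id, indicator)."""
        counts = [0] * len(self.centroids)
        new_topics = set()
        for text in posts:
            vec = Counter(text.lower().split())
            if not vec:
                continue
            # Find the most similar existing topic, if any.
            best, best_sim = None, 0.0
            for i, centroid in enumerate(self.centroids):
                sim = cosine(vec, centroid)
                if sim > best_sim:
                    best, best_sim = i, sim
            # Too dissimilar to every topic: start a new one.
            if best is None or best_sim < self.sim_threshold:
                self.centroids.append(Counter())
                self.pathways.append([0] * self.windows_seen)  # pad earlier windows
                counts.append(0)
                best = len(self.centroids) - 1
                new_topics.add(best)
            self.centroids[best].update(vec)  # incremental update, no retraining
            counts[best] += 1

        events = []
        for i, n in enumerate(counts):
            prev = self.pathways[i][-1] if self.pathways[i] else 0
            if i in new_topics and self.windows_seen > 0:
                events.append((i, "new_topic"))
            elif prev > 0 and n >= self.spike_ratio * prev:
                events.append((i, "volume_spike"))
            self.pathways[i].append(n)
        self.windows_seen += 1
        return events
```

Calling `process_window` once per time slice keeps the model incremental: earlier posts are never revisited, and `pathways` holds the temporal record of each topic. Topics that first appear in the very first window are not reported as events, since there is no earlier window to compare against.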