Journal: IETE Journal of Research

Unsupervised Event Detection Using Self-learning-based Max-margin Clustering: Analysis on Streaming Tweets

Abstract

In this paper, we propose an unsupervised approach for clustering tweets from a large-scale Twitter repository. The volume of data acquired from streaming media such as Twitter is vast, and it contains readily available information about important events taking place during the collection period. Given this volume, it is difficult to deploy supervised learning strategies to analyze the tweets and extract meaningful information. In addition, tweets are unstructured, reflecting the diversity of the end users who post them. An unsupervised tweet-processing technique can therefore be of immense help for inference tasks such as event extraction and sentiment analysis, to name a few. To address these bottlenecks, which affect the majority of existing techniques, we propose a novel unsupervised strategy for detecting events from streaming tweets. Specifically, we propose a self-learning max-margin clustering method that applies the notion of the support vector machine (SVM) in an unsupervised setup. We evaluate the proposed system and compare it with popular techniques from the literature using 6.5 million streaming tweets collected in June 2017. In our experiments, self-learning-based max-margin clustering outperforms the techniques from the literature in terms of precision, Silhouette score, and Calinski-Harabasz score.
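
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way an SVM can be used without ground-truth labels: seed two pseudo-label groups, fit a linear max-margin separator to them, relabel every point by the side of the separator it falls on, and repeat until the labels stop changing. It is not the authors' implementation. The two-cluster restriction, the seeding rule, the mean-centering step, the hand-written subgradient SVM, the toy 2-D points standing in for tweet feature vectors, and all function names (`train_linear_svm`, `self_learning_mmc`, `mean_silhouette`) are assumptions made for illustration. The silhouette function follows the standard definition of the score named in the abstract.

```python
import math


def decision(w, b, x):
    """Signed score of point x under the linear separator (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(points, labels, epochs=200, lr=0.1, reg=0.01):
    """Batch subgradient descent on the L2-regularised mean hinge loss."""
    n, dim = len(points), len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [reg * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            if y * decision(w, b, x) < 1:  # point lies inside the margin
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def self_learning_mmc(points, max_rounds=20):
    """Two-cluster self-training loop: pseudo-labels -> SVM -> relabel."""
    n = len(points)
    # Assumption: centre the features so the bias term starts near zero.
    means = [sum(col) / n for col in zip(*points)]
    data = [[x - m for x, m in zip(p, means)] for p in points]
    # Assumption: seed with the first point and the point farthest from it.
    seed_a = data[0]
    seed_b = max(data, key=lambda p: math.dist(p, seed_a))
    labels = [1 if math.dist(p, seed_a) <= math.dist(p, seed_b) else -1
              for p in data]
    for _ in range(max_rounds):
        w, b = train_linear_svm(data, labels)
        new_labels = [1 if decision(w, b, p) >= 0 else -1 for p in data]
        # Stop on convergence, or if the separator would collapse everything
        # into one cluster (a known degenerate solution of this scheme).
        if new_labels == labels or len(set(new_labels)) < 2:
            break
        labels = new_labels
    return labels


def mean_silhouette(points, labels):
    """Mean silhouette coefficient; expects at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        same = [math.dist(p, q) for j, q in enumerate(points)
                if j != i and labels[j] == labels[i]]
        if not same:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance within own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data: two well-separated groups of 2-D points.
demo_points = [(-3.0, -3.0), (-3.0, -2.0), (-2.0, -3.0), (-2.0, -2.0),
               (2.0, 2.0), (2.0, 3.0), (3.0, 2.0), (3.0, 3.0)]
demo_labels = self_learning_mmc(demo_points)
print(demo_labels, round(mean_silhouette(demo_points, demo_labels), 3))
```

On real tweets, the points would be some vector representation of the text, and a library SVM would normally replace the hand-written trainer. The sketch uses only the standard library so that it runs as-is.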
