Source: Computational Intelligence and Neuroscience (U.S. National Institutes of Health literature collection)

Exploiting Language Models to Classify Events from Twitter

Abstract

Classifying events on Twitter is challenging because tweet texts carry a large amount of temporal data, a lot of noise, and a wide variety of topics. In this paper, we propose a method to classify events from Twitter. We first find the distinguishing terms between tweets in events and measure their similarities using learned language models such as ConceptNet and a latent Dirichlet allocation method for selectional preferences (LDA-SP), both of which have been widely studied on large text corpora for modeling computational linguistic relations. The relationships among term words in tweets are discovered by checking them under each model. We then propose a method to compute the similarity between tweets based on their features, including their common term words and the relationships among their distinguishing term words. This similarity is explicit and convenient to apply in k-nearest neighbor techniques for classification. We ran experiments on the Edinburgh Twitter Corpus, which show that our method achieves competitive results for classifying events.
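The abstract does not give the similarity formula, so the following is only a minimal sketch of the general idea: score two tweets by their shared terms plus the relatedness of their non-shared (distinguishing) terms, then classify with a k-nearest neighbor vote. The `RELATEDNESS` table, the `weight` parameter, the scoring formula, and all function names are assumptions made for illustration; the toy lookup merely stands in for scores that would come from ConceptNet or LDA-SP.

```python
from collections import Counter

# Hypothetical relatedness scores standing in for ConceptNet / LDA-SP output.
RELATEDNESS = {
    frozenset(("quake", "tremor")): 0.9,
    frozenset(("goal", "match")): 0.8,
}


def relatedness(a, b):
    """Toy lookup (assumption); a real system would query a language model."""
    return RELATEDNESS.get(frozenset((a, b)), 0.0)


def tweet_similarity(t1, t2, weight=0.5):
    """Shared-term overlap plus weighted relatedness of distinguishing terms.

    The formula is an illustrative assumption, not the paper's definition.
    """
    s1, s2 = set(t1), set(t2)
    union = s1 | s2
    # Jaccard overlap of the common term words.
    overlap = len(s1 & s2) / len(union) if union else 0.0
    # Distinguishing terms: words that appear in only one of the two tweets.
    d1, d2 = s1 - s2, s2 - s1
    if d1 and d2:
        # Average, over t1's distinguishing terms, of the best match in t2's.
        rel = sum(max(relatedness(a, b) for b in d2) for a in d1) / len(d1)
    else:
        rel = 0.0
    return overlap + weight * rel


def knn_classify(query, labeled, k=3):
    """Majority vote among the k labeled tweets most similar to the query."""
    ranked = sorted(
        labeled,
        key=lambda item: tweet_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up tokenized tweets with event labels, for demonstration only.
LABELED = [
    (["quake", "city", "damage"], "earthquake"),
    (["tremor", "felt", "city"], "earthquake"),
    (["goal", "team", "win"], "sports"),
    (["match", "team", "score"], "sports"),
]

if __name__ == "__main__":
    print(knn_classify(["tremor", "damage"], LABELED, k=3))
```

Because the similarity is an explicit pairwise function, it can be plugged into a nearest-neighbor search directly, without first embedding tweets in a vector space.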