Conference of the European Chapter of the Association for Computational Linguistics

Event extraction from Twitter using Non-Parametric Bayesian Mixture Model with Word Embeddings

Abstract

To extract structured representations of newsworthy events from Twitter, unsupervised models typically assume that tweets involving the same named entities and expressed using similar words are likely to belong to the same event. Hence, they group tweets into clusters based on the co-occurrence patterns of named entities and topical keywords. However, there are two main limitations. First, they require the number of events to be known beforehand, which is not realistic in practical applications. Second, they don't recognise that the same named entity might be referred to by multiple mentions, and tweets using different mentions would be wrongly assigned to different events. To overcome these limitations, we propose a non-parametric Bayesian mixture model with word embeddings for event extraction, in which the number of events can be inferred automatically and the issue of lexical variations for the same named entity can be dealt with properly. Our model has been evaluated on three datasets with sizes ranging between 2,499 and over 60 million tweets. Experimental results show that our model outperforms the baseline approach on all datasets by 5-8% in F-measure.
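
The abstract does not say which non-parametric prior the model uses. As a minimal sketch, assuming a Chinese restaurant process (a common choice when the number of mixture components should not be fixed in advance), the following shows how the number of clusters can be left open instead of being supplied beforehand. The function name `crp_assignments` and the concentration parameter `alpha` are illustrative assumptions, not taken from the paper, and the code samples from the prior only: it ignores tweet content, named entities, and word embeddings.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster labels for n_items from a Chinese restaurant process prior.

    alpha (> 0) is a hypothetical concentration parameter: larger values
    make it more likely that an item opens a new cluster.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of items already assigned to cluster k
    labels = []
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        labels.append(k)
    return labels


if __name__ == "__main__":
    labels = crp_assignments(n_items=1000, alpha=2.0)
    print("clusters used:", len(set(labels)))
```

In a full mixture model, each assignment weight would also be multiplied by the likelihood of the tweet under that cluster, so the number of events would be driven by the data as well as by `alpha`.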
