Event extraction from Twitter using Non-Parametric Bayesian Mixture Model with Word Embeddings



Abstract

To extract structured representations of newsworthy events from Twitter, unsupervised models typically assume that tweets involving the same named entities and expressed using similar words are likely to belong to the same event. Hence, they group tweets into clusters based on the co-occurrence patterns of named entities and topical keywords. However, there are two main limitations. First, they require the number of events to be known beforehand, which is not realistic in practical applications. Second, they don't recognise that the same named entity might be referred to by multiple mentions, and tweets using different mentions would be wrongly assigned to different events. To overcome these limitations, we propose a non-parametric Bayesian mixture model with word embeddings for event extraction, in which the number of events can be inferred automatically and the issue of lexical variations for the same named entity can be dealt with properly. Our model has been evaluated on three datasets with sizes ranging between 2,499 and over 60 million tweets. Experimental results show that our model outperforms the baseline approach on all datasets by 5-8% in F-measure.
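
The abstract does not say which prior the model uses. As a rough illustration of how a non-parametric mixture can leave the number of clusters open, the sketch below samples cluster labels from a Chinese restaurant process, a standard construction for Dirichlet process mixtures. It is not the authors' model: it ignores named entities, keywords, and word embeddings entirely, and the names `crp_assignments` and `alpha` are assumptions made for this example.

```python
import random
from collections import Counter


def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster labels for n_items from a Chinese restaurant process.

    alpha (> 0) is a hypothetical concentration parameter: larger values
    make opening a new cluster more likely.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of items already assigned to cluster k
    labels = []
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


labels = crp_assignments(n_items=200, alpha=1.0)
print(len(set(labels)), "clusters;", Counter(labels).most_common(3))
```

The number of clusters is never passed in; it grows with the data at a rate governed by `alpha`. A full mixture model would additionally weight each choice by how well the item fits the cluster.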


Bibliographic details

Source
Conference of the European Chapter of the Association for Computational Linguistics | 2017 | xxxviii p. 643-1280 | 10 pages
Authors
Deyu Zhou; Xuan Zhang; Yulan He
Format: PDF
Chinese Library Classification: Programming, software engineering

Similar literature

1. Bayesian Non-Parametric Mixture Model with Application to Modeling Biological Markers [J]. Mercy K. Peter, Levi Mbugua, Anthony Wanjoya. Journal of Data Analysis and Information Processing. 2019, No. 4
2. Joint modeling of recurrent events and survival: a Bayesian non-parametric approach [J]. Giorgio Paulon, Maria De Iorio, Alessandra Guglielmi. Biostatistics. 2020, No. 1
3. Bayesian Non-Parametric Mixtures of GARCH(1,1) Models [J]. John W. Lau, Ed Cripps. Journal of Probability and Statistics. 2012, No. 4
4. Event extraction from Twitter using Non-Parametric Bayesian Mixture Model with Word Embeddings [C]. Deyu Zhou, Xuan Zhang, Yulan He. Conference of the European Chapter of the Association for Computational Linguistics. 2017
5. Hierarchical Non-Parametric Bayesian Mixture Models and Applications on Big Data [D]. Yerebakan, Halid Ziya. 2017
6. Adverse drug event and medication extraction in electronic health records via a cascading architecture with different sequence labeling models and word embeddings [O]. Hong-Jie Dai, Chu-Hsien Su, Chi-Shin Wu. 2020
7. Event extraction from Twitter using Non-Parametric Bayesian Mixture Model with Word Embeddings [O]. Deyu Zhou, Xuan Zhang, Yulan He. 2017
