Same Referent, Different Words: Unsupervised Mining of Opaque Coreferent Mentions

Abstract

Coreference resolution systems rely heavily on string overlap (e.g., Google Inc. and Google), performing badly on mentions with very different words (opaque mentions) like Google and the search giant. Yet prior attempts to resolve opaque pairs using ontologies or distributional semantics hurt precision more than they improved recall. We present a new unsupervised method for mining opaque pairs. Our intuition is to restrict distributional semantics to articles about the same event, thus promoting referential match. Using an English comparable corpus of tech news, we built a dictionary of opaque coreferent mentions (only 3% are in WordNet). Our dictionary can be integrated into any coreference system (it increases the performance of a state-of-the-art system by 1% F1 on all measures) and is easily extendable by using news aggregators.
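
The intuition in the abstract, restricting distributional evidence to articles about the same event, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes the articles have already been grouped by event and reduced to (event_id, verb, subject) tuples; the function names `mine_opaque_pairs` and `is_opaque`, the stop word list, and the sample tuples are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Assumption: a tiny stop word list so "Google Inc." and "Google" count as overlapping.
STOP_WORDS = {"the", "a", "an", "of", "inc", "inc."}


def content_words(mention):
    """Lowercased words of a mention, minus the stop words above."""
    return {w for w in mention.lower().split() if w not in STOP_WORDS}


def is_opaque(a, b):
    """True when two mentions share no content word (no string overlap)."""
    return not (content_words(a) & content_words(b))


def mine_opaque_pairs(tuples, min_count=1):
    """Pair subjects that share a verb within the same event cluster.

    tuples: iterable of (event_id, verb, subject).
    Returns {(mention_a, mention_b): count} for opaque pairs only.
    """
    by_context = defaultdict(set)
    for event_id, verb, subject in tuples:
        # The context key includes the event, so the same verb in
        # unrelated stories never links two mentions.
        by_context[(event_id, verb)].add(subject)

    counts = Counter()
    for subjects in by_context.values():
        for a, b in combinations(sorted(subjects), 2):
            if is_opaque(a, b):
                counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical input: three mentions from one story, one from another.
tuples = [
    ("story1", "acquire", "Google"),
    ("story1", "acquire", "the search giant"),
    ("story1", "acquire", "Google Inc."),
    ("story2", "acquire", "Facebook"),
]
pairs = mine_opaque_pairs(tuples)
```

In this toy input, "Facebook" shares the verb "acquire" with "Google" but appears in a different story, so it is never paired with it. "Google" and "Google Inc." are skipped because they overlap in wording and are therefore not opaque.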


Bibliographic details

Source: Conference of the North American Chapter of the Association for Computational Linguistics: human language technologies | 2013 | 10 pages
Authors: Marta Recasens; Matthew Can; Dan Jurafsky
Original format: PDF
Chinese Library Classification: Programming, Software Engineering

