Same Referent, Different Words: Unsupervised Mining of Opaque Coreferent Mentions

Abstract

Coreference resolution systems rely heavily on string overlap (e.g., Google Inc. and Google), performing badly on mentions with very different words (opaque mentions) like Google and the search giant. Yet prior attempts to resolve opaque pairs using ontologies or distributional semantics hurt precision more than improved recall. We present a new unsupervised method for mining opaque pairs. Our intuition is to restrict distributional semantics to articles about the same event, thus promoting referential match. Using an English comparable corpus of tech news, we built a dictionary of opaque coreferent mentions (only 3% are in WordNet). Our dictionary can be integrated into any coreference system (it increases the performance of a state-of-the-art system by 1% F1 on all measures) and is easily extendable by using news aggregators.
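The intuition stated in the abstract, restricting the search for coreferent candidates to articles about the same event, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' pipeline: the input format (each article as a list of hypothetical `(verb, role, mention)` triples), the idea of pairing mentions that fill the same verb slot, and the names `is_opaque` and `mine_opaque_pairs`.

```python
from collections import Counter
from itertools import combinations


def is_opaque(a: str, b: str) -> bool:
    """Return True when two mentions share no word (case-insensitive)."""
    return not (set(a.lower().split()) & set(b.lower().split()))


def mine_opaque_pairs(stories):
    """Count opaque mention pairs within same-event story clusters.

    `stories` is a list of clusters; each cluster is a list of articles
    about one event; each article is a list of (verb, role, mention)
    triples. This input format is a hypothetical simplification.
    """
    counts = Counter()
    for articles in stories:
        # Map each (verb, role) slot to the (article index, mention) pairs filling it.
        slots = {}
        for idx, article in enumerate(articles):
            for verb, role, mention in article:
                slots.setdefault((verb, role), set()).add((idx, mention))
        # Pair mentions that fill the same slot in *different* articles
        # of the same story and that have no words in common.
        for fillers in slots.values():
            for (i, m1), (j, m2) in combinations(sorted(fillers), 2):
                if i != j and is_opaque(m1, m2):
                    counts[tuple(sorted((m1, m2)))] += 1
    return counts


# Toy data: two articles reporting the same (invented) event.
stories = [
    [
        [("acquire", "subj", "Google"), ("acquire", "obj", "the startup")],
        [("acquire", "subj", "the search giant"), ("acquire", "obj", "the startup")],
    ]
]
pairs = mine_opaque_pairs(stories)
print(pairs)
```

Because pairs are only formed inside one story cluster, two mentions that merely appear in similar contexts across unrelated events are never proposed, which is the precision-oriented restriction the abstract describes.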

Publication details

Source
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2013, pp. 897-906 (10 pages)
Authors
Marta Recasens; Matthew Can; Dan Jurafsky
Original format: PDF
Date indexed: 2022-08-26 14:27:48

Similar literature

1. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features [J]. Nikfarjam Azadeh, Sarker Abeed, O'Connor Karen. Journal of the American Medical Informatics Association. 2015, No. 3
2. An Unsupervised Graph Based Continuous Word Representation Method for Biomedical Text Mining [J]. Zhenchao Jiang, Lishuang Li, Degen Huang. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2016, No. 4
3. Unsupervised Opinion Mining From Text Reviews Using SentiWordNet [J]. Vibha Soni, Meenakshi R Patel. International Journal of Computer Trends and Technology. 2014, No. 5
4. Same Referent, Different Words: Unsupervised Mining of Opaque Coreferent Mentions [C]. Marta Recasens, Matthew Can, Dan Jurafsky. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2013
5. Effect of context on recall of self-referent and group-referent words in individuals from individualistic or collectivistic cultures [D]. D'Urso, Nadia. 2003
6. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features [O]. Azadeh Nikfarjam, Abeed Sarker, Karen O’Connor. 2015