
Same Referent, Different Words: Unsupervised Mining of Opaque Coreferent Mentions


Abstract

Coreference resolution systems rely heavily on string overlap (e.g., Google Inc. and Google) and perform badly on mentions with very different words (opaque mentions), such as Google and the search giant. Yet prior attempts to resolve opaque pairs using ontologies or distributional semantics hurt precision more than they improved recall. We present a new unsupervised method for mining opaque pairs. Our intuition is to restrict distributional semantics to articles about the same event, thus promoting referential match. Using an English comparable corpus of tech news, we built a dictionary of opaque coreferent mentions (only 3% are in WordNet). Our dictionary can be integrated into any coreference system (it increases the performance of a state-of-the-art system by 1% F1 on all measures) and is easily extendable by using news aggregators.
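The core intuition, comparing contexts only within articles about the same event, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the input format (story clusters of `(mention, context)` tuples), the function names `content_words` and `mine_opaque_pairs`, the tiny stopword list, and the word-overlap test for opacity are all assumptions made for illustration. The abstract does not specify how contexts are extracted or how pairs are filtered.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "inc", "inc."}


def content_words(mention):
    """Return the lowercased tokens of a mention, minus stopwords."""
    return {tok for tok in mention.lower().split() if tok not in STOPWORDS}


def mine_opaque_pairs(stories):
    """Count candidate opaque pairs across story clusters.

    `stories` is a list of clusters; each cluster is a list of
    (mention, context) tuples pooled from articles about the same event.
    Two mentions become a candidate pair when they share a context inside
    one cluster and have no content word in common.
    """
    counts = defaultdict(int)
    for cluster in stories:
        # Group mentions by shared context, but only within this cluster.
        by_context = defaultdict(set)
        for mention, context in cluster:
            by_context[context].add(mention)

        seen = set()
        for mentions in by_context.values():
            for a, b in combinations(sorted(mentions), 2):
                if content_words(a) & content_words(b):
                    continue  # string overlap, so the pair is not opaque
                seen.add((a, b))

        # Count each pair at most once per story cluster.
        for pair in seen:
            counts[pair] += 1
    return dict(counts)


if __name__ == "__main__":
    # Toy data invented for this example.
    stories = [
        [("Google", "acquire"), ("the search giant", "acquire"),
         ("Google Inc.", "acquire")],
        [("Google", "announce"), ("the search giant", "announce"),
         ("Apple", "sue")],
    ]
    for pair, n in sorted(mine_opaque_pairs(stories).items()):
        print(pair, n)
```

In this toy input, Google and Google Inc. are skipped because they share a word, while Google and the search giant are kept because they share a context in the same story but no words. A mined dictionary of this shape could then be consulted by a coreference system as an additional lookup when string matching fails.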
