Same Referent, Different Words: Unsupervised Mining of Opaque Coreferent Mentions

Abstract

Coreference resolution systems rely heavily on string overlap (e.g., Google Inc. and Google), performing badly on mentions with very different words (opaque mentions) like Google and the search giant. Yet prior attempts to resolve opaque pairs using ontologies or distributional semantics hurt precision more than they improved recall. We present a new unsupervised method for mining opaque pairs. Our intuition is to restrict distributional semantics to articles about the same event, thus promoting referential match. Using an English comparable corpus of tech news, we built a dictionary of opaque coreferent mentions (only 3% are in WordNet). Our dictionary can be integrated into any coreference system (it increases the performance of a state-of-the-art system by 1% F1 on all measures) and is easily extendable by using news aggregators.
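
The intuition of restricting distributional evidence to articles about the same event can be illustrated with a minimal Python sketch. It is not the authors' implementation: the input format (each article reduced to a list of hypothetical `(mention, context)` pairs, where the context might be a governing verb), the word-overlap test in `is_opaque`, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def is_opaque(a: str, b: str) -> bool:
    """Assumed definition: two mentions are opaque if they share no word."""
    return not (set(a.lower().split()) & set(b.lower().split()))


def mine_opaque_pairs(stories, min_count=1):
    """Collect candidate opaque pairs from same-event article clusters.

    stories: list of clusters; each cluster is a list of articles;
    each article is a list of (mention, context) tuples (hypothetical format).
    """
    counts = Counter()
    for articles in stories:
        seen = set()  # count each pair at most once per story
        # Only compare mentions across different articles of the same story.
        for art_a, art_b in combinations(articles, 2):
            for m_a, ctx_a in art_a:
                for m_b, ctx_b in art_b:
                    if ctx_a == ctx_b and is_opaque(m_a, m_b):
                        seen.add(tuple(sorted((m_a, m_b))))
        counts.update(seen)
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy data: two stories, each covered by two articles.
stories = [
    [[("Google", "acquire")], [("the search giant", "acquire")]],
    [[("Google", "announce"), ("Google Inc.", "announce")],
     [("the search giant", "announce")]],
]
print(mine_opaque_pairs(stories, min_count=2))
```

Requiring a pair to recur across several stories (`min_count`) is one simple way to trade recall for precision in such a dictionary; string-overlapping pairs such as Google Inc. and Google are skipped because existing systems already handle them.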