Profile Based Cross-Document Coreference Using Kernelized Fuzzy Relational Clustering

Abstract

Coreferencing entities across documents in a large corpus enables advanced document understanding tasks such as question answering. This paper presents a novel cross-document coreference approach that leverages entity profiles, which are constructed with information extraction tools and reconciled by a within-document coreference module. We propose to match the profiles using a learned ensemble distance function composed of a suite of similarity specialists. We develop a kernelized soft relational clustering algorithm that uses the learned distance function to partition the entities into fuzzy sets of identities. We compare the kernelized clustering method with a popular fuzzy relational clustering algorithm (FRC) and show a 5% improvement in coreference performance. Evaluation of our proposed methods on a large benchmark disambiguation collection shows that they compare favorably with the top runs in the SemEval evaluation.
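
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a standard feature-space kernel fuzzy c-means driven purely by a pairwise distance matrix, with a Gaussian kernel applied to the distances. The function names (`combine_specialists`, `kernel_fuzzy_clusters`), the fixed specialist weights, the parameters `m` and `sigma`, and the toy distance matrix are all hypothetical and chosen for illustration.

```python
import math


def combine_specialists(scores, weights):
    """Weighted average of per-specialist distances.

    The weights are fixed, hypothetical values; the paper learns its
    ensemble, which this sketch does not attempt to reproduce.
    """
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def kernel_fuzzy_clusters(dist, init_u, m=2.0, sigma=0.5, iters=50):
    """Kernel fuzzy c-means using only pairwise distances (assumed variant).

    dist   : n x n symmetric distance matrix with a zero diagonal
    init_u : c x n initial memberships, each column summing to 1
    Returns a c x n membership matrix.
    """
    n, c = len(dist), len(init_u)
    # Gaussian kernel on the distances (an assumption, not from the paper).
    kern = [[math.exp(-dist[i][j] ** 2 / (2 * sigma ** 2)) for j in range(n)]
            for i in range(n)]
    u = [row[:] for row in init_u]
    power = 1.0 / (m - 1.0)
    for _ in range(iters):
        sq = []  # sq[k][i]: squared feature-space distance, point i to center k
        for row in u:
            powered = [x ** m for x in row]
            total = sum(powered)
            w = [p / total for p in powered]  # implicit center weights
            center_norm = sum(w[j] * w[l] * kern[j][l]
                              for j in range(n) for l in range(n))
            sq.append([
                max(kern[i][i]
                    - 2 * sum(w[j] * kern[i][j] for j in range(n))
                    + center_norm, 1e-12)  # guard against zero or negative
                for i in range(n)
            ])
        # Standard fuzzy c-means membership update on the squared distances.
        for i in range(n):
            inv = [(1.0 / sq[k][i]) ** power for k in range(c)]
            norm = sum(inv)
            for k in range(c):
                u[k][i] = inv[k] / norm
    return u


# Toy example: four entity profiles, where 0 and 1 are close to each other
# and 2 and 3 are close to each other (made-up distances).
toy_dist = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.15],
    [0.80, 0.90, 0.15, 0.00],
]
toy_init = [
    [0.6, 0.5, 0.4, 0.5],
    [0.4, 0.5, 0.6, 0.5],
]
memberships = kernel_fuzzy_clusters(toy_dist, toy_init)
```

Each column of `memberships` gives one profile's degree of belonging to each candidate identity, which is the "fuzzy sets of identities" notion from the abstract. In practice each entry of the distance matrix would come from something like `combine_specialists`, applied to the outputs of the individual similarity specialists for a pair of profiles.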