Conference: 9th International Conference on Language Resources and Evaluation

Multilingual corpora with coreferential annotation of person entities

Abstract

This paper presents three corpora with coreferential annotation of person entities for Portuguese, Galician and Spanish. They contain coreference links between several types of pronouns (including elliptical, possessive, indefinite, demonstrative, relative, and personal clitic and non-clitic pronouns) and nominal phrases (including proper nouns). Statistics computed over the corpora show distributional aspects of coreference in both journalistic and encyclopedic texts. Furthermore, the paper shows the importance of coreference resolution for a task such as Information Extraction by evaluating the output of an Open Information Extraction system on the annotated corpora. The corpora are freely distributed in two formats, (i) the SemEval-2010 format and (ii) the format of the brat rapid annotation tool, so they can be enlarged and improved collaboratively.
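As a rough illustration of how the brat distribution might be consumed, the sketch below reads brat standoff annotations and groups mentions into coreference chains. It assumes the general brat standoff conventions (tab-separated `T` lines for text-bound mentions, `R` lines for binary relations, `*` lines for equivalence sets). The labels `Person` and `Coref`, the sample offsets, and the function names `parse_brat` and `chains` are hypothetical and are not taken from the corpora themselves, whose actual label inventory may differ.

```python
from collections import defaultdict


def parse_brat(ann_text):
    """Parse brat standoff text into (mentions, links).

    mentions: id -> (label, start, end, surface text)
    links:    list of (mention id, mention id) pairs
    """
    mentions = {}
    links = []
    for line in ann_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T"):
            # Text-bound annotation: "T1<TAB>Label start end<TAB>text"
            label, span = fields[1].split(" ", 1)
            # Discontinuous spans use ';' between fragments; keep outer bounds.
            parts = [p.split() for p in span.split(";")]
            start, end = int(parts[0][0]), int(parts[-1][1])
            text = fields[2] if len(fields) > 2 else ""
            mentions[ann_id] = (label, start, end, text)
        elif ann_id.startswith("R"):
            # Binary relation: "R1<TAB>Label Arg1:T2 Arg2:T1"
            _, arg1, arg2 = fields[1].split()
            links.append((arg1.split(":", 1)[1], arg2.split(":", 1)[1]))
        elif ann_id == "*":
            # Equivalence set: "*<TAB>Equiv T1 T2 T3"
            _, *ids = fields[1].split()
            links.extend(zip(ids, ids[1:]))
    return mentions, links


def chains(mentions, links):
    """Group mention ids into chains with a small union-find."""
    parent = {m: m for m in mentions}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for m in mentions:
        groups[find(m)].append(m)
    # Order mentions inside each chain by start offset.
    return sorted(sorted(g, key=lambda m: mentions[m][1]) for g in groups.values())


# Hypothetical sample: labels and offsets are invented for illustration.
SAMPLE = (
    "T1\tPerson 0 5\tMaria\n"
    "T2\tPerson 17 20\tshe\n"
    "T3\tPerson 30 34\tJoão\n"
    "R1\tCoref Arg1:T2 Arg2:T1\n"
)

sample_mentions, sample_links = parse_brat(SAMPLE)
print(chains(sample_mentions, sample_links))  # [['T1', 'T2'], ['T3']]
```

Mentions that take part in no link come out as singleton chains, which makes it easy to count linked and unlinked person mentions separately when computing distributional statistics.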
