Conference: International Atlantic Web Intelligence Conference (AWIC 2004); May 16-19, 2004; Cancun, Mexico

Web-Based Sources for an Annotated Corpus Building and Composite Proper Name Identification

Abstract

Text collections annotated at several levels are valuable resources, but developing them for languages like Spanish requires considerable effort. In this work, we present the initial step, lexical-level annotation, in compiling an annotated Mexican corpus from Web-based sources. We also describe a method, based on heterogeneous knowledge and simple Web-based sources, for the proper name identification that such annotation requires. We focus on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). Preliminary results are presented.