Web-Based Sources for an Annotated Corpus Building and Composite Proper Name Identification

Abstract

Collections of texts annotated on several levels are useful resources, but developing them for languages like Spanish requires a huge effort. In this work, we present the initial step, lexical-level annotation, in compiling an annotated Mexican corpus from Web-based sources. We also describe a method, based on heterogeneous knowledge and simple Web-based sources, for the proper name identification that such annotation requires. We focus on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). Preliminary results are presented.


Bibliographic details

Source
International Atlantic Web Intelligence Conference (AWIC 2004); 16-19 May 2004; Cancun, MX | 2004 | pp. 115-124 | 10 pages
Conference location: Cancun (MX)
Authors
Sofia N. Galicia-Haro; Alexander Gelbukh; Igor A. Bolshakov
Author affiliation

Faculty of Sciences, UNAM, Ciudad Universitaria, Mexico City, Mexico

Original format: PDF
Text language: English
Chinese Library Classification: Computer networks

Similar literature


1. The Pathogen-annotated Tracking Resource Network (PATRN) system: A web-based resource to aid food safety, regulatory science, and investigations of foodborne pathogens and disease [J]. G. Gopinath, K. Hari, R. Jain. Food Microbiology. 2013, No. 2
2. Building a semantically annotated corpus for chronic disease complications using two document types [J]. Noha Alnazzawi. PLoS One. 2021, No. 3
3. Building semantically annotated corpus for text classification of Indian defence news articles [J]. Saurabh A. Kanekar, Alind Sharma, Gaurang S. Patkar. International Journal of Information Technology. 2021, No. 4
4. Web-Based Sources for an Annotated Corpus Building and Composite Proper Name Identification [C]. Sofia N. Galicia-Haro, Alexander Gelbukh, Igor A. Bolshakov. International Atlantic Web Intelligence Conference. 2004
5. Annotating a corpus of biomedical research texts: Two models of rhetorical analysis [D]. White, Barbara Ellen. 2010
6. Building a semantically annotated corpus for chronic disease complications using two document types [O]. Noha Alnazzawi. 2021
7. Web-Based Sources for an Annotated Corpus Building and Composite Proper Name Identification [O]. Sofía N. Galicia-Haro, Alexander Gelbukh, Igor A. Bolshakov. 2013
