N~3 - A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format

Abstract

Extracting Linked Data following the Semantic Web principle from unstructured sources has become a key challenge for scientific research. Named Entity Recognition and Disambiguation are two basic operations in this extraction process. One step towards the realization of the Semantic Web vision and the development of highly accurate tools is the availability of data for validating the quality of processes for Named Entity Recognition and Disambiguation as well as for algorithm tuning. This article presents three novel, manually curated and annotated corpora (N~3). All of them are based on a free license and stored in the NLP Interchange Format to leverage the Linked Data character of our datasets.

Bibliographic details

Source
9th International Conference on Language Resources and Evaluation | 2014 | pp. 4115-4119 | 5 pages
Authors
Michael Roeder; Ricardo Usbeck; Sebastian Hellmann; Daniel Gerber; Andreas Both;
Original format: PDF
Keywords
Datasets; NLP Interchange Format; Named Entity Detection; Named Entity Disambiguation;

Similar literature

1. Exploring entity recognition and disambiguation for cultural heritage collections [J] . van Hooland Seth, De Wilde Max, Verborgh Ruben, Literary & linguistic computing . 2015, No. 2

2. Biomedical named entity recognition and linking datasets: survey and our recent development [J] . Ming-Siang Huang, Po-Ting Lai, Pei-Yen Lin, Briefings in bioinformatics . 2020, No. 6

3. Interlinking SciGraph and DBpedia Datasets Using Link Discovery and Named Entity Recognition Techniques [J] . Beyza Yaman, Michele Pasin, Markus Freudenberg, OASIcs : OpenAccess Series in Informatics . 2019, No. 1

4. N~3 - A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format [C] . Michael Roeder, Ricardo Usbeck, Sebastian Hellmann, 9th International conference on language resources and evaluation . 2014

5. Learning for information extraction: From named entity recognition and disambiguation to relation extraction. [D] . Bunescu, Razvan Constantin. 2007

6. OryzaGP: rice gene and protein dataset for named-entity recognition [O] . Pierre Larmande, Huy Do, Yue Wang . 2019

7. Linguistically Informed Relation Extraction and Neural Architectures for Nested Named Entity Recognition in BioNLP-OST 2019 [O] . Pankaj Gupta, Usama Yaseen, Hinrich Schütze . 2019

