WiNER: A Wikipedia Annotated Corpus for Named Entity Recognition

Abstract

We revisit the idea of mining Wikipedia to generate named-entity annotations. We propose a new methodology, which we applied to the English Wikipedia to build WiNER, a large, high-quality annotated corpus. We evaluate its usefulness on 6 NER tasks, comparing 4 popular state-of-the-art approaches. We show that LSTM-CRF is the approach that benefits the most from our corpus. We report impressive gains with this model when using a small portion of WiNER on top of the CoNLL training material. Lastly, we propose a simple but efficient method for exploiting the full range of WiNER, leading to further improvements.

Bibliographic details

Source
International Joint Conference on Natural Language Processing | 2017 | pp. 413-422 | 10 pages
Authors
Abbas Ghaddar; Philippe Langlais
Original format: PDF

Similar literature

1. Development of a Hindi Named Entity Recognition System without Using Manually Annotated Training Corpus [J]. Saha Sujan Kumar, Majumder Mukta. The International Arab Journal of Information Technology. 2018, No. 6
2. An Automatically Generated Annotated Corpus for Albanian Named Entity Recognition [J]. Klesti Hoxha, Artur Baxhaku. Cybernetics and Information Technologies: CIT. 2017, No. 1
3. Assessment of disease named entity recognition on a corpus of annotated sentences [J]. Antonio Jimeno, Ernesto Jimenez-Ruiz, Vivian Lee. BMC Bioinformatics. 2008, Supplement 3
4. WiNER: A Wikipedia Annotated Corpus for Named Entity Recognition [C]. Abbas Ghaddar, Philippe Langlais. International Joint Conference on Natural Language Processing. 2017
5. Arabic Named Entity Recognition: A Corpus-Based Study [D]. Algahtani, Shabib. 2012
6. Assessment of disease named entity recognition on a corpus of annotated sentences [O]. Antonio Jimeno, Ernesto Jimenez-Ruiz, Vivian Lee. 2008
7. Automatic Creation of Arabic Named Entity Annotated Corpus Using Wikipedia [O]. Althobaiti Maha, Kruschwitz Udo, Poesio Massimo. 2014
