An Automatically Generated Annotated Corpus for Albanian Named Entity Recognition

Klesti Hoxha; Artur Baxhaku

Abstract

Named Entity Recognition (NER) is an important task in many NLP pipelines. It has become especially important for the knowledge bases that power many of today's information retrieval systems. To cope with the high demand for annotated training corpora for supervised NER systems, automatic generation approaches have been proposed. In this paper we report on the first automatically generated NE-annotated corpus for Albanian. News articles from Albanian news media were used as the document source. They were automatically tagged using a custom gazetteer generated from the Albanian Wikipedia. Our evaluation results show that this corpus can be used as a baseline corpus for human-annotated ones or as a training corpus where no other is available.
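
The abstract does not spell out the tagging procedure, so the following is only a minimal sketch of how gazetteer-based tagging can work in general, not the authors' implementation. It assumes a greedy longest-match lookup over whitespace tokens and BIO output labels; the `GAZETTEER` entries, the label set (`PER`, `LOC`, `ORG`), and the function name `tag_tokens` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy gazetteer: token sequences mapped to entity types.
# A real gazetteer would be derived from a resource such as Wikipedia.
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("Tirana",): "LOC",
    ("Ismail", "Kadare"): "PER",
    ("Universiteti", "i", "Tiranës"): "ORG",
}

# Longest entry length bounds how far ahead we need to look.
MAX_LEN = max(len(entry) for entry in GAZETTEER)


def tag_tokens(tokens: List[str]) -> List[Tuple[str, str]]:
    """Tag tokens with BIO labels using greedy longest-match lookup."""
    tagged: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        match_len = 0
        label: Optional[str] = None
        # Try the longest candidate span first, then shrink.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in GAZETTEER:
                match_len, label = n, GAZETTEER[candidate]
                break
        if label is None:
            # No gazetteer entry starts here: mark as outside any entity.
            tagged.append((tokens[i], "O"))
            i += 1
        else:
            # First token of the match gets B-, the rest get I-.
            for j in range(match_len):
                prefix = "B" if j == 0 else "I"
                tagged.append((tokens[i + j], f"{prefix}-{label}"))
            i += match_len
    return tagged


if __name__ == "__main__":
    sentence = "Ismail Kadare lindi në Gjirokastër".split()
    for token, tag in tag_tokens(sentence):
        print(token, tag)
```

Exact string matching of this kind misses inflected forms and entities absent from the gazetteer, which is one reason such a corpus is better treated as a baseline than as a gold standard.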

Bibliographic details

Source: Cybernetics and information technologies: CIT, 2018, Issue 1, 14 pages
Authors: Klesti Hoxha; Artur Baxhaku
Original format: PDF
Classification (Chinese Library Classification): automatic information theory
