Improving the Performance of a Named Entity Extractor by Applying a Stacking Scheme

Abstract

In this paper we investigate how to improve the performance of a Named Entity Extraction (NEE) system by applying machine learning techniques and corpus transformation. The main resources used in our experiments are the publicly available tagger TnT and a corpus of Spanish texts in which named entity occurrences are tagged with BIO tags. We split the NEE task into two subtasks: 1) Named Entity Recognition (NER), which involves identifying the group of words that make up the name of an entity, and 2) Named Entity Classification (NEC), which determines the category of a named entity. We have focused our work on improving the NER task, generating four different taggers from the same training corpus and combining them using a stacking scheme. We improve on the NER baseline (F_{β=1} value of 81.84), reaching a value of 88.37. When a NEC module is added to the NER system, the performance of the whole NEE task also improves: a value of 70.47 is achieved from a baseline of 66.07.
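The abstract does not say which learner combines the four taggers, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a very simple second-level learner: a lookup table that maps each combination of base-tagger outputs to the gold tag most often seen with it on held-out data, with a majority vote as fallback for unseen combinations. All function names (`train_meta`, `stack_predict`, `bio_spans`, `f_beta1`) are hypothetical, and the NER tags are reduced to plain `B`, `I`, `O`. The last function computes a span-level F_{β=1} score as a fraction (the abstract reports it as a percentage).

```python
from collections import Counter, defaultdict


def train_meta(base_outputs, gold):
    """Learn a table: tuple of base-tagger tags -> most frequent gold tag.

    base_outputs is a list of tag sequences, one per base tagger,
    all aligned token by token with the gold sequence.
    """
    counts = defaultdict(Counter)
    for votes, tag in zip(zip(*base_outputs), gold):
        counts[votes][tag] += 1
    return {votes: c.most_common(1)[0][0] for votes, c in counts.items()}


def stack_predict(table, base_outputs):
    """Combine base-tagger outputs with the learned table.

    Combinations never seen in training fall back to a majority vote
    (an assumption of this sketch, not something the abstract states).
    """
    result = []
    for votes in zip(*base_outputs):
        if votes in table:
            result.append(table[votes])
        else:
            result.append(Counter(votes).most_common(1)[0][0])
    return result


def bio_spans(tags):
    """Return the set of (start, end) spans encoded by a BIO sequence.

    end is exclusive. An I tag with no open entity is ignored.
    """
    spans, start = set(), None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a final span
        if start is not None and tag != "I":
            spans.add((start, i))
            start = None
        if tag == "B":
            start = i
    return spans


def f_beta1(gold, predicted):
    """Span-level F score with beta = 1 (harmonic mean of precision and recall)."""
    g, p = bio_spans(gold), bio_spans(predicted)
    correct = len(g & p)
    if correct == 0:
        return 0.0
    precision, recall = correct / len(p), correct / len(g)
    return 2 * precision * recall / (precision + recall)


# Toy held-out data: four base taggers (made-up outputs) and the gold tags.
dev_gold = ["B", "I", "O", "O", "B", "O"]
dev_outputs = [
    ["B", "I", "O", "O", "B", "O"],
    ["B", "O", "O", "O", "B", "O"],
    ["B", "I", "O", "B", "B", "O"],
    ["O", "I", "O", "O", "B", "O"],
]
table = train_meta(dev_outputs, dev_gold)

# New sentence: the third tagger wrongly opens an entity at position 0.
new_outputs = [
    ["O", "B", "I", "O"],
    ["O", "B", "O", "O"],
    ["B", "B", "I", "O"],
    ["O", "B", "I", "O"],
]
combined = stack_predict(table, new_outputs)
print(combined, f_beta1(["O", "B", "I", "O"], combined))
```

In this toy setup the table has learned from the held-out data that an isolated `B` from the third tagger is usually wrong, which is the kind of systematic error a second-level learner can correct and a single tagger cannot.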

Bibliographic record

Source
Ibero-American Conference on AI (IBERAMIA 2004); 22-26 November 2004; Puebla, Mexico | 2004 | pp. 295-304 | 10 pages
Conference location: Puebla, Mexico
Authors
Jose A. Troyano; Victor J. Diaz; Fernando Enriquez; Luisa Romero
Author affiliation
Department of Languages and Computer Systems, University of Seville, Av. Reina Mercedes s/n, 41012 Sevilla (Spain)

Original format: PDF
Language: English
Chinese Library Classification: Artificial intelligence theory

