Unsupervised Active Learning of CRF Model for Cross-Lingual Named Entity Recognition

Abstract

Manually annotating training data for information extraction models is time-consuming and expensive, yet necessary for building information extraction systems. Active learning has proven effective at reducing manual annotation effort for supervised learning tasks: a human judge is asked to annotate only the examples that are most informative with respect to a given model. However, in most cases reliable human judges are not available for every language. In this paper, we propose a cross-lingual unsupervised active learning paradigm (XLADA) that generates high-quality, automatically annotated training data from a word-aligned parallel corpus. To evaluate the paradigm, we applied XLADA to English-French and English-Chinese bilingual corpora and then trained French and Chinese information extraction models. The experimental results show that XLADA can produce effective models without manually annotated training data.
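
The abstract does not spell out how XLADA works internally, so the following is only a minimal sketch of two generic building blocks that a pipeline of this kind might use: copying entity labels across a word alignment, and picking the sentences a model is least sure about. The function names `project_labels` and `select_most_uncertain`, the toy sentence pair, the alignment, and the confidence scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def project_labels(src_labels: List[str],
                   alignment: List[Tuple[int, int]],
                   tgt_len: int) -> List[str]:
    """Copy each labeled source token's tag onto the target tokens aligned to it.

    `alignment` holds (source_index, target_index) pairs. Target tokens with
    no labeled aligned source token keep the outside tag "O".
    """
    tgt_labels = ["O"] * tgt_len
    for src_idx, tgt_idx in alignment:
        if src_labels[src_idx] != "O":
            tgt_labels[tgt_idx] = src_labels[src_idx]
    return tgt_labels


def select_most_uncertain(confidences: List[float], k: int) -> List[int]:
    """Return the indices of the k lowest-confidence sentences (uncertainty sampling)."""
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return order[:k]


# Toy English-French pair; labels and alignment are made up for illustration.
en_tokens = ["Marie", "Curie", "lived", "in", "Paris"]
en_labels = ["PER", "PER", "O", "O", "LOC"]
fr_tokens = ["Marie", "Curie", "vivait", "à", "Paris"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]

fr_labels = project_labels(en_labels, alignment, len(fr_tokens))

# Hypothetical per-sentence confidences from a target-language model.
picked = select_most_uncertain([0.91, 0.42, 0.77], k=2)
```

In a setup like this, the projected labels would stand in for the annotations a human judge normally provides, which is one plausible reading of "unsupervised" active learning here; the paper itself should be consulted for the actual selection criterion and CRF training details.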


Bibliographic details

Source
Artificial Neural Networks in Pattern Recognition | 2014 | pp. 23-34 | 12 pages
Conference location: Montreal (CA)
Authors
Mohamed Farouk Abdel Hady; Abubakrelsedik Karali; Eslam Kamal; Rania Ibrahim;
Author affiliations

Microsoft Research Cairo, Egypt (all four authors)

Conference organizer:
Original format: PDF
Language: English
Chinese Library Classification:
Keywords
Information extraction; named entity recognition; cross-lingual domain adaptation; unsupervised active learning;

Date indexed: 2022-08-26 13:51:22

Similar literature

1. Unsupervised Active Learning of CRF Model for Cross-Lingual Information Extraction [J]. Mohamed Farouk Abdel Hady, Abubakrelsedik Karali, Eslam Kamal, International journal of computational linguistics and applications. 2014, No. 2

2. Chemical named entity recognition in patents by domain knowledge and unsupervised feature learning [J]. Hua Xu, Hui Chen, Jingqi Wang, Database. 2016, No. 2010

3. LSTM-CRF Models for Named Entity Recognition [J]. Changki LEE, IEICE transactions on information and systems. 2017, No. 4

4. Unsupervised Active Learning of CRF Model for Cross-Lingual Named Entity Recognition [C]. Mohamed Farouk Abdel Hady, Abubakrelsedik Karali, Eslam Kamal, IAPR TC3 International Workshop on Artificial Neural Networks in Pattern Recognition. 2014

5. From Preprocessing to Named Entity Recognition, Linking and Clustering in Multilingual, Cross-Lingual, High-Low Resources Settings [D]. Zirikly, Ayah. 2018

6. Wide-scope biomedical named entity recognition and normalization with CRFs fuzzy matching and character level modeling [O]. Suwisa Kaewphan, Kai Hakala, Niko Miekka, 2018

7. UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data [O]. Qianhui Wu, Zijia Lin, Börje F. Karlsson, 2020
