Published in: Artificial neural networks in pattern recognition (foreign-language conference proceedings)

Unsupervised Active Learning of CRF Model for Cross-Lingual Named Entity Recognition

Abstract

Manual annotation of training data for information extraction models is time-consuming and expensive, but it is necessary for building information extraction systems. Active learning has been shown to reduce manual annotation effort for supervised learning tasks by asking a human judge to annotate only the examples that are most informative with respect to a given model. However, in most cases reliable human judges are not available for all languages. In this paper, we propose a cross-lingual unsupervised active learning paradigm (XLADA) that generates high-quality, automatically annotated training data from a word-aligned parallel corpus. To evaluate the paradigm, we applied XLADA to English-French and English-Chinese bilingual corpora and then trained French and Chinese information extraction models. The experimental results show that XLADA can produce effective models without manually annotated training data.
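The abstract does not describe how annotations move from one language to the other, so the following is only a minimal sketch of one common approach, label projection over word alignments, and not necessarily the procedure XLADA uses. It assumes BIO-style entity tags on the source sentence and an alignment given as (source index, target index) pairs; the function name `project_labels` and the example sentence are hypothetical.

```python
from typing import List, Optional, Tuple


def project_labels(
    src_labels: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Project BIO entity tags from source tokens onto aligned target tokens.

    Assumption (not from the abstract): each target token takes the entity
    type of the first non-"O" source token it is aligned to, and BIO
    prefixes are then re-derived on the target side.
    """
    # Entity type per target token; None means "outside any entity".
    tgt_types: List[Optional[str]] = [None] * tgt_len
    for s, t in alignment:
        label = src_labels[s]
        if label != "O" and tgt_types[t] is None:
            tgt_types[t] = label.split("-", 1)[1]

    # Re-encode as BIO. Limitation of this sketch: two adjacent entities of
    # the same type are merged into a single span.
    tgt_labels: List[str] = []
    prev: Optional[str] = None
    for typ in tgt_types:
        if typ is None:
            tgt_labels.append("O")
        elif typ == prev:
            tgt_labels.append("I-" + typ)
        else:
            tgt_labels.append("B-" + typ)
        prev = typ
    return tgt_labels


# Hypothetical English-French sentence pair.
src_tokens = ["Barack", "Obama", "visited", "Paris"]
src_labels = ["B-PER", "I-PER", "O", "B-LOC"]
tgt_tokens = ["Barack", "Obama", "a", "visité", "Paris"]
alignment = [(0, 0), (1, 1), (2, 3), (3, 4)]

projected = project_labels(src_labels, alignment, len(tgt_tokens))
for token, label in zip(tgt_tokens, projected):
    print(token, label)
```

Target sentences labelled this way could then serve as training data for a target-language sequence model such as a CRF; how XLADA selects or filters such sentences is not stated in the abstract.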
