Venue: International Conference on Computational Linguistics

Bitext Name Tagging for Cross-lingual Entity Annotation Projection

Abstract

Annotation projection is a practical way to address the low-resource problem in incident language (IL) processing. Previous annotation projection methods relied mainly on word alignment results without any training process, so word alignment errors propagated as noise into the projected annotations. In this paper, we focus on the named entity recognition (NER) task and propose a weakly supervised framework that projects entity annotations from English to an IL through bitexts. Instead of relying directly on word alignment results, the framework combines the advantages of rule-based and deep learning methods in two steps: first, it generates a high-confidence entity annotation set on the IL side using strict search methods; second, it uses this high-confidence set to weakly supervise model training. The trained model is then used to carry out the projection. Experimental results on two low-resource ILs show that the proposed method generates better annotations projected from English-IL parallel corpora. The performance of an IL name tagger can also be improved significantly by training on the newly projected IL annotation set.
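The first step described above (building a high-confidence IL annotation set by strict search) can be illustrated with a minimal sketch. The abstract does not specify the search rules, so everything here is an assumption made for illustration: the function names `find_span` and `project_strict`, the bilingual name lexicon, the exact-match criterion, and the policy of discarding a sentence pair unless every English entity is found on the IL side. It is not the authors' implementation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: (English surface form, entity type).
Entity = Tuple[str, str]


def find_span(tokens: List[str], phrase: List[str]) -> Optional[int]:
    """Return the start index of the first exact occurrence of phrase, or None."""
    n = len(phrase)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            return i
    return None


def project_strict(
    en_entities: List[Entity],
    il_tokens: List[str],
    lexicon: Dict[str, List[str]],
) -> Optional[List[str]]:
    """Project English entities onto IL tokens as BIO tags.

    Assumed strictness rule: the sentence pair is kept only if every
    English entity has an exact lexicon match on the IL side; otherwise
    None is returned and the pair is left out of the high-confidence set.
    """
    tags = ["O"] * len(il_tokens)
    for surface, etype in en_entities:
        match: Optional[Tuple[int, int]] = None
        for candidate in lexicon.get(surface.lower(), []):
            phrase = candidate.split()
            if not phrase:
                continue  # ignore empty lexicon entries
            pos = find_span(il_tokens, phrase)
            # Accept only spans that do not overlap an earlier projection.
            if pos is not None and all(t == "O" for t in tags[pos:pos + len(phrase)]):
                match = (pos, len(phrase))
                break
        if match is None:
            return None  # one miss rejects the whole sentence pair
        start, length = match
        tags[start] = "B-" + etype
        for j in range(start + 1, start + length):
            tags[j] = "I-" + etype
    return tags


# Toy data: the IL tokens and lexicon entries are invented placeholders.
toy_lexicon = {"john smith": ["jon smit"], "kampala": ["kampala"]}
toy_il_tokens = ["jon", "smit", "xx", "kampala"]
toy_entities = [("John Smith", "PER"), ("Kampala", "GPE")]
toy_tags = project_strict(toy_entities, toy_il_tokens, toy_lexicon)
```

Under these assumptions, sentence pairs that return a tag sequence would form the high-confidence set used to weakly supervise the tagger in the second step, while rejected pairs would be left for the trained model to annotate.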
