Conference: International Joint Conference on Natural Language Processing; Annual Meeting of the Association for Computational Linguistics
BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition

Abstract

We study the problem of learning a named entity recognition (NER) tagger using noisy labels from multiple weak supervision sources. Though cheap to obtain, the labels from weak supervision sources are often incomplete, inaccurate, and contradictory, making it difficult to learn an accurate NER model. To address this challenge, we propose a conditional hidden Markov model (CHMM), which can effectively infer true labels from multi-source noisy labels in an unsupervised way. CHMM enhances the classic hidden Markov model with the contextual representation power of pre-trained language models. Specifically, CHMM learns token-wise transition and emission probabilities from the BERT embeddings of the input tokens to infer the latent true labels from noisy observations. We further refine CHMM with an alternate-training approach (CHMM-ALT). It fine-tunes a BERT-NER model with the labels inferred by CHMM, and this BERT-NER's output is regarded as an additional weak source to train the CHMM in return. Experiments on four NER benchmarks from various domains show that our method outperforms state-of-the-art weakly supervised NER models by wide margins.
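To make the idea of token-wise transition and emission probabilities concrete, here is a minimal sketch of decoding in a hidden Markov model whose parameters differ at every token and whose observations come from several weak sources. It is an illustration under stated assumptions, not the authors' implementation: the function name `viterbi_tokenwise`, the two-label scheme, and every probability below are hypothetical and hand-written, whereas the abstract says CHMM learns these quantities from BERT embeddings. The sketch also assumes the sources are conditionally independent given the latent label, and it covers only inference of the latent labels, not the unsupervised training.

```python
import math


def viterbi_tokenwise(init, transitions, emissions, observations):
    """Most likely latent label sequence for an HMM with per-token parameters.

    init[i]               = P(z_0 = i)
    transitions[t][i][j]  = P(z_t = j | z_{t-1} = i) for t >= 1 (index 0 is unused)
    emissions[t][k][i][o] = P(source k reports label o | z_t = i)
    observations[t][k]    = label index reported by source k at token t

    All probabilities must be strictly positive, since the scores are log-probabilities.
    """
    n_labels = len(init)
    n_tokens = len(observations)

    def log_emit(t, i):
        # Assumption: sources are conditionally independent given the latent label.
        return sum(
            math.log(emissions[t][k][i][o]) for k, o in enumerate(observations[t])
        )

    score = [math.log(init[i]) + log_emit(0, i) for i in range(n_labels)]
    backpointers = []
    for t in range(1, n_tokens):
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for landing in label j at token t.
            best_i = max(
                range(n_labels),
                key=lambda i: score[i] + math.log(transitions[t][i][j]),
            )
            new_score.append(
                score[best_i] + math.log(transitions[t][best_i][j]) + log_emit(t, j)
            )
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final label.
    path = [max(range(n_labels), key=lambda i: score[i])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy setup: labels 0 = "O", 1 = "ENT"; three tokens; two weak sources.
init = [0.8, 0.2]
transitions = [
    None,                      # no transition into the first token
    [[0.7, 0.3], [0.4, 0.6]],  # token-specific transition matrix for t = 1
    [[0.5, 0.5], [0.2, 0.8]],  # a different matrix for t = 2
]
reliable = [[0.9, 0.1], [0.1, 0.9]]  # source 0 usually reports the true label
noisy = [[0.6, 0.4], [0.4, 0.6]]     # source 1 is only slightly better than chance
emissions = [[reliable, noisy] for _ in range(3)]
observations = [[0, 0], [1, 0], [1, 1]]  # the sources disagree at the middle token

path = viterbi_tokenwise(init, transitions, emissions, observations)
print(path)  # [0, 1, 1]
```

At the middle token the two sources contradict each other, and the decoded label follows the source with the sharper emission matrix. That is the sense in which a model of this shape can weigh multi-source noisy labels rather than count votes.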