Annual Meeting of the Association for Computational Linguistics

Reliability-aware Dynamic Feature Composition for Name Tagging

Abstract

While word embeddings are widely used for a variety of tasks and substantially improve performance, their quality is not consistent throughout the vocabulary due to the long-tail distribution of word frequency. Without sufficient context, embeddings of rare words are usually less reliable than those of common words. However, current models typically trust all word embeddings equally regardless of their reliability, and thus may introduce noise and hurt performance. Since names often contain rare and unknown words, this problem is particularly critical for name tagging. In this paper, we propose a novel reliability-aware name tagging model to tackle this issue. We design a set of word frequency-based reliability signals to indicate the quality of each word embedding. Guided by the reliability signals, the model is able to dynamically select and compose features such as word embeddings and character-level representations using gating mechanisms. For example, if an input word is rare, the model relies less on its word embedding and assigns higher weights to its character and contextual features. Experiments on OntoNotes 5.0 show that our model outperforms the baseline model, obtaining up to 6.2% absolute gain in F-score. In cross-genre experiments on six genres in OntoNotes, our model improves the performance for most genre pairs and achieves 2.3% absolute F-score gain on average.
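The gating idea in the abstract can be illustrated with a minimal sketch. This is a simplification under stated assumptions, not the authors' architecture: the log-scaled frequency signal, the single scalar gate, and the names `reliability_signal`, `compose`, `weight`, and `bias` are all hypothetical, and the fixed `weight` and `bias` values stand in for parameters that a real model would learn.

```python
import math


def reliability_signal(count, max_count):
    """Hypothetical reliability signal: log-scaled word frequency in [0, 1]."""
    return math.log1p(count) / math.log1p(max_count)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def compose(word_vec, char_vec, reliability, weight=4.0, bias=-2.0):
    """Mix word-level and character-level features with a scalar gate.

    weight and bias are illustrative stand-ins for learned parameters.
    A gate near 1 trusts the word embedding; a gate near 0 falls back
    to the character-level representation.
    """
    gate = sigmoid(weight * reliability + bias)
    mixed = [gate * w + (1.0 - gate) * c for w, c in zip(word_vec, char_vec)]
    return mixed, gate


# Toy vectors chosen so the mixed output directly exposes the gate value.
word_vec = [1.0, 0.0]
char_vec = [0.0, 1.0]

rare_mixed, rare_gate = compose(word_vec, char_vec, reliability_signal(0, 1000))
common_mixed, common_gate = compose(word_vec, char_vec, reliability_signal(1000, 1000))
```

With these toy inputs, an unseen word gets a gate below 0.5, so its composed vector leans on the character-level features, while the most frequent word gets a gate above 0.5 and leans on its word embedding.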
