Annual Meeting of the Association for Computational Linguistics

BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance

Abstract

Pretraining deep language models has led to large performance gains in NLP. Despite this success, Schick and Schuetze (2020) recently showed that these models struggle to understand rare words. For static word embeddings, this problem has been addressed by separately learning representations for rare words. In this work, we transfer this idea to pretrained language models: We introduce BERTRAM, a powerful architecture based on BERT that is capable of inferring high-quality embeddings for rare words that are suitable as input representations for deep language models. This is achieved by enabling the surface form and contexts of a word to interact with each other in a deep architecture. Integrating BERTRAM into BERT leads to large performance increases due to improved representations of rare and medium frequency words on both a rare word probing task and three downstream tasks.

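The abstract's key mechanism is letting a word's surface form and the contexts it occurs in interact within a deep architecture, producing an embedding that can serve as an input representation for a pretrained model. The PyTorch sketch below illustrates that general idea only; it is not the authors' implementation, and the n-gram vocabulary size, the attention-based form-context interaction, and the gated combination are all illustrative assumptions.

```python
# Minimal sketch of the form-context idea behind BERTRAM (not the authors'
# code). A rare word's surface form and a set of context embeddings interact
# to produce a single input embedding. Dimensions and modules are assumptions.

import torch
import torch.nn as nn


class FormContextFuser(nn.Module):
    """Fuses a surface-form embedding with embeddings of the contexts in
    which a rare word occurs, letting the two interact via attention."""

    def __init__(self, dim: int = 768, n_heads: int = 4):
        super().__init__()
        # Surface form: mean of character-n-gram embeddings (FastText-style).
        # The 50k n-gram vocabulary size is an arbitrary placeholder.
        self.ngram_emb = nn.EmbeddingBag(num_embeddings=50_000, embedding_dim=dim)
        # Deep interaction: the form embedding attends over context embeddings.
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        # Scalar gate deciding how much to trust form vs. context information.
        self.gate = nn.Linear(2 * dim, 1)

    def forward(self, ngram_ids: torch.Tensor,
                context_embs: torch.Tensor) -> torch.Tensor:
        # ngram_ids:     (batch, n_ngrams) ids of the word's character n-grams
        # context_embs:  (batch, n_contexts, dim), e.g. one vector per context
        #                sentence obtained by encoding it with BERT
        form = self.ngram_emb(ngram_ids)              # (batch, dim)
        query = form.unsqueeze(1)                     # (batch, 1, dim)
        ctx, _ = self.attn(query, context_embs, context_embs)
        ctx = ctx.squeeze(1)                          # (batch, dim)
        # Gated combination of surface-form and context evidence.
        alpha = torch.sigmoid(self.gate(torch.cat([form, ctx], dim=-1)))
        return alpha * form + (1 - alpha) * ctx       # (batch, dim)


if __name__ == "__main__":
    fuser = FormContextFuser()
    ngram_ids = torch.randint(0, 50_000, (2, 6))  # 6 n-grams per word
    context_embs = torch.randn(2, 10, 768)        # 10 contexts per word
    emb = fuser(ngram_ids, context_embs)
    print(emb.shape)  # torch.Size([2, 768])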