Workshop on Domain Adaptation for NLP

Analyzing the Domain Robustness of Pretrained Language Models, Layer by Layer


Abstract

The robustness of pretrained language models (PLMs) is generally measured by performance drops across two or more domains. However, we do not yet understand the inherent robustness contributed by the different layers of a PLM. We systematically analyze the robustness of these representations layer by layer from two perspectives. First, we measure the robustness of representations using the domain divergence between two domains. We find that (i) domain variance increases from the lower to the upper layers of vanilla PLMs; (ii) models continuously pretrained on domain-specific data (DAPT; Gururangan et al., 2020) exhibit more variance than their pretrained PLM counterparts; and (iii) distilled models (e.g., DistilBERT) also show greater domain variance. Second, we investigate the robustness of representations by analyzing the encoded syntactic and semantic information using diagnostic probes. We find that similar layers carry similar amounts of linguistic information for data from an unseen domain.
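The layer-by-layer measurement described above can be sketched with a toy example. The abstract does not specify the exact divergence measure, so the sketch below assumes a simple mean-embedding (linear-MMD-style) distance between the per-layer hidden states of two domains; the data, layer count, and widening domain shift are synthetic and purely illustrative.

```python
import numpy as np

def layer_domain_divergence(reps_a: np.ndarray, reps_b: np.ndarray) -> float:
    """Mean-embedding divergence between two domains' layer representations.

    reps_a, reps_b: arrays of shape (n_examples, hidden_dim), e.g. pooled
    hidden states of one PLM layer for texts from each domain.
    """
    return float(np.linalg.norm(reps_a.mean(axis=0) - reps_b.mean(axis=0)))

# Hypothetical per-layer hidden states for two domains (synthetic data).
rng = np.random.default_rng(0)
n_examples, hidden_dim, n_layers = 100, 768, 4

divergences = []
for layer in range(n_layers):
    shift = 0.1 * layer  # toy assumption: upper layers drift further apart
    reps_domain_a = rng.normal(0.0, 1.0, size=(n_examples, hidden_dim))
    reps_domain_b = rng.normal(shift, 1.0, size=(n_examples, hidden_dim))
    divergences.append(layer_domain_divergence(reps_domain_a, reps_domain_b))

print(divergences)
```

In this toy setup the divergence grows with layer depth, mirroring the paper's finding (i) that domain variance increases from lower to upper layers; with a real PLM one would instead pool the hidden states returned for each layer on actual text from the two domains.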