Conference: International Joint Conference on Natural Language Processing; Annual Meeting of the Association for Computational Linguistics

Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization

Abstract

The Lottery Ticket Hypothesis suggests that an over-parametrized network consists of "lottery tickets", and that training a certain collection of them (i.e., a subnetwork) can match the performance of the full model. In this paper, we study such a collection of tickets, referred to as "winning tickets", in extremely over-parametrized models, e.g., pre-trained language models. We observe that at certain compression ratios, the generalization performance of the winning tickets can not only match but also exceed that of the full model. In particular, we observe a phase transition phenomenon: as the compression ratio increases, the generalization performance of the winning tickets first improves and then deteriorates after a certain threshold. We refer to the tickets at the threshold as "super tickets". We further show that the phase transition is task- and model-dependent: as the model size becomes larger and the training data set becomes smaller, the transition becomes more pronounced. Our experiments on the GLUE benchmark show that the super tickets improve single-task fine-tuning by 0.9 points on BERT-base and 1.0 points on BERT-large in terms of task-average score. We also demonstrate that adaptively sharing the super tickets across tasks benefits multi-task learning.
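
The phase transition described in the abstract can be illustrated with a minimal Python sketch. It assumes a sweep of (compression ratio, validation score) pairs is already available and simply picks the peak of that curve as the "super ticket" operating point. The function names, the convention that the compression ratio is the fraction of the model removed, and the toy numbers are all assumptions made for illustration; they are not the authors' pruning procedure or results.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (compression_ratio, validation_score)


def find_super_ticket(sweep: Sequence[Point]) -> Point:
    """Return the sweep point with the highest validation score.

    Ties are broken in favor of the higher compression ratio,
    i.e. the smaller subnetwork (an assumption of this sketch).
    """
    if not sweep:
        raise ValueError("sweep must contain at least one point")
    return max(sweep, key=lambda p: (p[1], p[0]))


def improves_then_deteriorates(sweep: Sequence[Point]) -> bool:
    """Check for the rise-then-fall shape: the best score lies strictly
    inside the sweep, above both the least and most compressed points."""
    ordered = sorted(sweep)
    if len(ordered) < 3:
        return False
    best = find_super_ticket(ordered)
    return ordered[0][1] < best[1] and ordered[-1][1] < best[1]


# Toy, made-up numbers (NOT from the paper): ratio 0.0 is the full model.
toy_sweep = [(0.0, 0.50), (0.1, 0.60), (0.2, 0.70), (0.3, 0.65), (0.5, 0.40)]

ratio, score = find_super_ticket(toy_sweep)
print(f"peak at compression ratio {ratio}, score {score}")
print("rise-then-fall shape:", improves_then_deteriorates(toy_sweep))
```

In practice each score in such a sweep would come from fine-tuning a pruned subnetwork at that ratio, so the sweep itself is the expensive step; the selection shown here is only the final bookkeeping.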
