
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets



Abstract

Heavily overparameterized language models such as BERT, XLNet and T5 have achieved impressive success in many NLP tasks. However, their high model complexity requires enormous computation resources and extremely long training time for both pre-training and fine-tuning. Many works have studied model compression on large NLP models, but they focus only on reducing inference time while still requiring an expensive training process. Other works use extremely large batch sizes to shorten the pre-training time, at the expense of higher computational resource demands. In this paper, inspired by the Early-Bird Lottery Tickets recently studied for computer vision tasks, we propose EarlyBERT, a general computationally-efficient training algorithm applicable to both pre-training and fine-tuning of large-scale language models. By slimming the self-attention and fully-connected sub-layers inside a transformer, we are the first to identify structured winning tickets in the early stage of BERT training. We apply those tickets towards efficient BERT training, and conduct comprehensive pre-training and fine-tuning experiments on GLUE and SQuAD downstream tasks. Our results show that EarlyBERT achieves comparable performance to standard BERT, with 35~45% less training time.
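As a rough illustration of the slimming idea mentioned in the abstract (not the authors' released implementation), the sketch below attaches learnable coefficients to each self-attention head and each feed-forward unit of a transformer layer; an L1 penalty drives unimportant coefficients toward zero, and the heads and units with the smallest coefficients can then be pruned to form a structured early-bird ticket. All module and function names here (SlimmedTransformerLayer, slimming_l1_penalty) are hypothetical.

```python
# Minimal sketch of slimming coefficients on attention heads and FFN units,
# assuming PyTorch; names are illustrative, not from the EarlyBERT codebase.
import torch
import torch.nn as nn

class SlimmedTransformerLayer(nn.Module):
    """Transformer layer with learnable slimming coefficients on each
    self-attention head and each hidden unit of the feed-forward sub-layer."""
    def __init__(self, hidden=768, heads=12, ffn_hidden=3072):
        super().__init__()
        self.heads, self.head_dim = heads, hidden // heads
        self.qkv = nn.Linear(hidden, 3 * hidden)
        self.attn_out = nn.Linear(hidden, hidden)
        self.ffn_in = nn.Linear(hidden, ffn_hidden)
        self.ffn_out = nn.Linear(ffn_hidden, hidden)
        # Slimming coefficients: one per attention head, one per FFN unit.
        self.head_coef = nn.Parameter(torch.ones(heads))
        self.ffn_coef = nn.Parameter(torch.ones(ffn_hidden))

    def forward(self, x):
        b, t, h = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, t, self.heads, self.head_dim).transpose(1, 2)
        k = k.view(b, t, self.heads, self.head_dim).transpose(1, 2)
        v = v.view(b, t, self.heads, self.head_dim).transpose(1, 2)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        ctx = attn.softmax(dim=-1) @ v                    # (b, heads, t, d)
        ctx = ctx * self.head_coef.view(1, -1, 1, 1)      # scale each head
        ctx = ctx.transpose(1, 2).reshape(b, t, h)
        x = x + self.attn_out(ctx)
        ffn = torch.relu(self.ffn_in(x)) * self.ffn_coef  # scale each FFN unit
        return x + self.ffn_out(ffn)

def slimming_l1_penalty(model, strength=1e-4):
    """L1 regularizer on the slimming coefficients; added to the training loss
    so that unimportant heads/units shrink and can be pruned early."""
    return strength * sum(p.abs().sum()
                          for n, p in model.named_parameters() if "coef" in n)
```

In this reading, the ticket is drawn after only a short period of training: heads and FFN units whose coefficients fall below a threshold are removed, and the remaining slimmed network is trained to completion.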

