Conference on Empirical Methods in Natural Language Processing

Pretrained Language Model Embryology: The Birth of ALBERT



Abstract

While the behaviors of pretrained language models (LMs) have been thoroughly examined, what happens during pretraining is rarely studied. We thus investigate the developmental process from a set of randomly initialized parameters to a totipotent language model, which we refer to as the embryology of a pretrained language model. Our results show that ALBERT learns to reconstruct and predict tokens of different parts of speech (POS) at different speeds during pretraining. We also find that linguistic knowledge and world knowledge do not generally improve as pretraining proceeds, nor does downstream task performance. These findings suggest that the knowledge of a pretrained model varies during pretraining, and that taking more pretraining steps does not necessarily give a model more comprehensive knowledge.
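The per-POS analysis described above can be sketched in a minimal, hypothetical form (this is an illustration, not the authors' code): at a given pretraining checkpoint, group masked-token reconstruction accuracy by the gold token's POS tag, then compare the resulting curves across checkpoints.

```python
from collections import defaultdict

def accuracy_by_pos(gold_tokens, predicted_tokens, pos_tags):
    """Group masked-token reconstruction accuracy by POS tag.

    The three arguments are parallel lists: the original (masked) token,
    the model's prediction at some pretraining checkpoint, and the gold
    token's part-of-speech tag. Returns {pos_tag: accuracy}.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for gold, pred, pos in zip(gold_tokens, predicted_tokens, pos_tags):
        total[pos] += 1
        if gold == pred:
            correct[pos] += 1
    return {pos: correct[pos] / total[pos] for pos in total}

# Toy example: at this hypothetical checkpoint, nouns are recovered
# more often than verbs.
gold = ["cat", "sat", "mat", "dog", "ran"]
pred = ["cat", "sat", "rug", "dog", "walked"]
tags = ["NOUN", "VERB", "NOUN", "NOUN", "VERB"]
print(accuracy_by_pos(gold, pred, tags))
```

Repeating this measurement at successive checkpoints yields one learning curve per POS category, which is how differing learning speeds across parts of speech become visible.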
