KNOWLEDGE DISTILLATION-BASED COMPRESSION METHOD FOR PRE-TRAINED LANGUAGE MODEL, AND PLATFORM
Abstract
A knowledge distillation-based compression method for a pre-trained language model, and a platform. The method first designs a universal feature transfer knowledge distillation strategy: while distilling knowledge from the teacher model to the student model, the feature maps of each layer of the student model are driven to approximate the corresponding features of the teacher model, with emphasis on the feature expression capacity of the teacher model's intermediate layers on small samples, and these features are used to guide the student model. Next, the ability of the teacher model's self-attention distribution to capture semantic and syntactic relations between words is exploited to construct a knowledge distillation method based on self-attention crossover. Finally, to improve the learning quality in the early stage of training and the generalization ability in the late stage, a linear transfer strategy based on the Bernoulli probability distribution is designed to gradually complete the transfer of feature-map and self-attention-distribution knowledge from the teacher to the student. By means of the present method, automatic compression is performed on multi-task-oriented pre-trained language models, improving language model compression efficiency.
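The three ingredients described above (feature-map transfer, self-attention distillation, and a Bernoulli-gated linear transfer schedule) can be sketched roughly as follows. This is a minimal illustration under assumed loss choices (MSE for feature maps, KL divergence for attention distributions) and an assumed linear schedule p(t) = t/T; the function names and shapes are hypothetical, not the patent's actual implementation.

```python
import numpy as np

def feature_map_loss(student_feat, teacher_feat):
    """Mean-squared error pushing the student's layer feature map toward
    the teacher's (a common choice for feature-transfer distillation)."""
    return float(np.mean((student_feat - teacher_feat) ** 2))

def attention_kl_loss(student_attn, teacher_attn, eps=1e-9):
    """Average per-row KL divergence from the teacher's self-attention
    distribution to the student's (each row sums to 1)."""
    t = teacher_attn + eps
    s = student_attn + eps
    return float(np.sum(t * np.log(t / s)) / t.shape[0])

def transfer_prob(step, total_steps):
    """Linearly increasing probability of performing knowledge transfer:
    rare early (the student focuses on the task loss), certain late."""
    return min(1.0, step / total_steps)

def distill_step(step, total_steps, s_feat, t_feat, s_attn, t_attn, rng):
    """One training step: a Bernoulli draw with probability p(step)
    decides whether the two distillation losses are applied."""
    if rng.random() < transfer_prob(step, total_steps):
        return feature_map_loss(s_feat, t_feat) + attention_kl_loss(s_attn, t_attn)
    return 0.0  # no transfer this step; only the task loss would apply
```

Both losses vanish when the student matches the teacher exactly, so the gate only controls *when* transfer pressure is applied, not its target.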