
Noise Stability Regularization for Improving BERT Fine-tuning



Abstract

Fine-tuning pre-trained language models such as BERT has become a common practice dominating leaderboards across various NLP tasks. Despite its recent success and wide adoption, this process is unstable when there are only a small number of training samples available, and its brittleness is often reflected in its sensitivity to random seeds. In this paper, we propose to tackle this problem based on the noise stability property of deep nets, which has been investigated in recent literature (Arora et al., 2018; Sanyal et al., 2020). Specifically, we introduce a novel and effective regularization method to improve fine-tuning on NLP tasks, referred to as Layer-wise Noise Stability Regularization (LNSR). We extend the theory of adding noise to the input and prove that our method yields a more stable regularization effect. We provide supporting evidence by experimentally confirming that well-performing models show low sensitivity to noise, and that fine-tuning with LNSR exhibits markedly better generalizability and stability. Furthermore, our method also demonstrates advantages over other state-of-the-art algorithms, including L2-SP (Li et al., 2018), Mixout (Lee et al., 2020) and SMART (Jiang et al., 2020).
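The abstract does not spell out the exact objective, but as a rough illustration, a layer-wise noise stability penalty of this flavor could be sketched in PyTorch as below: Gaussian noise is injected into a layer's input and the resulting change in that layer's output is penalized. The helper name lnsr_penalty, the noise scale sigma and the weight lambda_lnsr are hypothetical illustrations, not the authors' implementation.

    import torch
    import torch.nn as nn

    def lnsr_penalty(layer: nn.Module, hidden: torch.Tensor, sigma: float = 0.01) -> torch.Tensor:
        # Output of the layer on the clean hidden representation.
        clean_out = layer(hidden)
        # Output of the same layer when Gaussian noise is injected into its input
        # (sigma is an assumed noise scale, not a value from the paper).
        noisy_out = layer(hidden + sigma * torch.randn_like(hidden))
        # Penalize how far the layer's output moves under the perturbation (squared L2).
        return (noisy_out - clean_out).pow(2).sum(dim=-1).mean()

    # Hypothetical use during fine-tuning: add the per-layer penalties to the task loss,
    # e.g. loss = task_loss + lambda_lnsr * sum(lnsr_penalty(l, h) for l, h in layer_inputs)

In such a setup, the penalty would be accumulated over the layers being regularized and weighted against the task loss; the actual layers, noise distribution, and weighting scheme are as defined in the paper.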
