Conference of the European Chapter of the Association for Computational Linguistics

Maximal Multiverse Learning for Promoting Cross-Task Generalization of Fine-Tuned Language Models



Abstract

Language modeling with BERT consists of two phases of (i) unsupervised pre-training on unlabeled text, and (ii) fine-tuning for a specific supervised task. We present a method that leverages the second phase to its fullest, by applying an extensive number of parallel classifier heads, which are enforced to be orthogonal, while adaptively eliminating the weaker heads during training. We conduct an extensive inter- and intra-dataset evaluation, showing that our method improves the generalization ability of BERT, sometimes leading to a +9% gain in accuracy. These results highlight the importance of a proper fine-tuning procedure, especially for relatively smaller-sized datasets. Our code is attached as supplementary.
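The abstract describes the core mechanism: many parallel classifier heads trained on top of the shared encoder, constrained to be mutually orthogonal, with the weaker heads eliminated during training. The following is a minimal PyTorch sketch of that idea, not the authors' implementation; the module name, the penalty weight of 0.1, and the random stand-in for BERT's pooled output are illustrative assumptions.

```python
import torch
import torch.nn as nn


class MultiverseHeads(nn.Module):
    """K parallel linear classifier heads over a shared pooled representation."""

    def __init__(self, hidden_size: int, num_labels: int, num_heads: int):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_size, num_labels) for _ in range(num_heads)]
        )

    def forward(self, pooled: torch.Tensor) -> torch.Tensor:
        # pooled: (batch, hidden) -> logits: (num_heads, batch, num_labels)
        return torch.stack([head(pooled) for head in self.heads])

    def orthogonality_penalty(self) -> torch.Tensor:
        # Penalize pairwise overlap between the heads' normalized, flattened
        # weight matrices so different heads learn different decision directions.
        W = torch.stack([h.weight.flatten() for h in self.heads])   # (K, d)
        W = nn.functional.normalize(W, dim=1)
        gram = W @ W.t()                                            # (K, K)
        off_diag = gram - torch.eye(len(self.heads), device=W.device)
        return off_diag.pow(2).sum()


# Usage: per-head cross-entropy averaged over heads, plus the orthogonality term.
heads = MultiverseHeads(hidden_size=768, num_labels=2, num_heads=8)
pooled = torch.randn(4, 768)           # stand-in for BERT's pooled [CLS] output
labels = torch.randint(0, 2, (4,))
logits = heads(pooled)                 # (8, 4, 2)
ce = torch.stack(
    [nn.functional.cross_entropy(l, labels) for l in logits]
).mean()
loss = ce + 0.1 * heads.orthogonality_penalty()
loss.backward()
```

The adaptive elimination step described in the abstract would sit on top of this loop, for example by periodically dropping the heads with the weakest held-out performance; that logic is omitted from the sketch.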
