Cloze-driven Pretraining of Self-attention Networks

Abstract

We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems. Our model solves a cloze-style word reconstruction task, where each word is ablated and must be predicted given the rest of the text. Experiments demonstrate large performance gains on GLUE and new state-of-the-art results on NER as well as constituency parsing benchmarks, consistent with BERT. We also present a detailed analysis of a number of factors that contribute to effective pretraining, including data domain and size, model capacity, and variations on the cloze objective.
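To make the cloze objective concrete, the sketch below ablates each word in turn and trains a bidirectional transformer encoder to reconstruct it from the surrounding context. This is a minimal PyTorch approximation for illustration only, not the authors' exact two-tower (forward/backward) architecture; the vocabulary size, the reserved mask id, the model dimensions, and the per-position masking loop are all assumptions of this example.

```python
# Minimal sketch of a cloze-style word-reconstruction objective (illustrative,
# not the paper's exact model). VOCAB_SIZE, MASK_ID, and the dimensions below
# are assumed values for the example.
import torch
import torch.nn as nn

VOCAB_SIZE = 10000   # assumed toy vocabulary size
MASK_ID = 0          # assumed id reserved for the ablated ("masked") word
D_MODEL = 128

class ClozeTransformer(nn.Module):
    def __init__(self, vocab_size=VOCAB_SIZE, d_model=D_MODEL, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)  # scores for the ablated word

    def forward(self, token_ids):
        h = self.encoder(self.embed(token_ids))
        return self.out(h)  # (batch, seq_len, vocab_size) logits

def cloze_loss(model, token_ids):
    """Ablate each position in turn and predict it from the rest of the text."""
    batch, seq_len = token_ids.shape
    losses = []
    for pos in range(seq_len):
        corrupted = token_ids.clone()
        corrupted[:, pos] = MASK_ID              # remove the word at this position
        logits = model(corrupted)[:, pos, :]     # prediction for the ablated slot
        losses.append(nn.functional.cross_entropy(logits, token_ids[:, pos]))
    return torch.stack(losses).mean()

if __name__ == "__main__":
    model = ClozeTransformer()
    tokens = torch.randint(1, VOCAB_SIZE, (2, 16))  # toy batch of token ids
    loss = cloze_loss(model, tokens)
    loss.backward()
    print(f"cloze loss: {loss.item():.3f}")
```

In practice, masking every position one at a time is expensive; BERT-style implementations mask a random subset of positions per batch instead, which this loop only mimics in spirit.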
