Conference on Empirical Methods in Natural Language Processing

From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers



Abstract

Massively multilingual transformers (MMTs) pretrained via language modeling (e.g., mBERT, XLM-R) have become a default paradigm for zero-shot language transfer in NLP, offering unmatched transfer performance. Current evaluations, however, verify their efficacy in transfers (a) to languages with sufficiently large pretraining corpora, and (b) between close languages. In this work, we analyze the limitations of downstream language transfer with MMTs, showing that, much like cross-lingual word embeddings, they are substantially less effective in resource-lean scenarios and for distant languages. Our experiments, encompassing three lower-level tasks (POS tagging, dependency parsing, NER) and two high-level tasks (NLI, QA), empirically correlate transfer performance with linguistic proximity between source and target languages, but also with the size of target language corpora used in MMT pretraining. Most importantly, we demonstrate that the inexpensive few-shot transfer (i.e., additional fine-tuning on a few target-language instances) is surprisingly effective across the board, warranting more research efforts reaching beyond the limiting zero-shot conditions.
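The few-shot recipe the abstract describes (fine-tuning on source-language data, then continuing fine-tuning on a handful of target-language instances) is straightforward to sketch. Below is a minimal illustration, assuming XLM-R via the Hugging Face transformers library and a sequence-classification task; the toy data, label scheme, and hyperparameters are hypothetical stand-ins, not the paper's actual experimental setup.

```python
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "xlm-roberta-base"  # one of the MMTs studied in the paper
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

def fine_tune(texts, labels, epochs=3, lr=2e-5):
    """One fine-tuning pass; the same routine serves both transfer stages."""
    optimizer = AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        loss = model(**batch, labels=torch.tensor(labels)).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Hypothetical toy data standing in for real source/target corpora.
source_texts  = ["The movie was great.", "Terrible plot.", "It was okay."]
source_labels = [2, 0, 1]
target_texts  = ["Der Film war großartig.", "Schreckliche Handlung."]
target_labels = [2, 0]

# Stage 1: fine-tune on (plentiful) source-language data.
# Stopping here yields the standard zero-shot transfer model.
fine_tune(source_texts, source_labels)

# Stage 2 (few-shot): briefly continue fine-tuning on a few target-language
# instances -- the inexpensive step the paper finds effective across the board.
fine_tune(target_texts, target_labels, epochs=10)
```

The key point of the second stage is that it reuses the already fine-tuned model rather than training anew, so its cost amounts to a few extra gradient steps on a handful of labeled target-language examples.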
