Conference on Empirical Methods in Natural Language Processing (EMNLP)

Towards Reasonably-Sized Character-Level Transformer NMT by Finetuning Subword Systems

Abstract

Applying the Transformer architecture on the character level usually requires very deep architectures that are difficult and slow to train. These problems can be partially overcome by incorporating a segmentation into tokens in the model. We show that by initially training a subword model and then finetuning it on characters, we can obtain a neural machine translation model that works at the character level without requiring token segmentation. We use only the vanilla 6-layer Transformer Base architecture. Our character-level models better capture morphological phenomena and show more robustness to noise at the expense of somewhat worse overall translation quality. Our study is a significant step towards high-performance and easy to train character-based models that are not extremely large.
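The abstract outlines a two-stage recipe: first train an ordinary subword-level Transformer Base model, then continue training the same network on character-segmented data. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the vocabulary sizes, the class and variable names, and the choice to freshly re-initialize the character-level embedding and output layers are illustrative assumptions, since the abstract does not specify those details.

```python
import torch
import torch.nn as nn

# Placeholder vocabulary sizes (illustrative, not from the paper).
SUBWORD_VOCAB_SIZE = 32_000   # e.g. a BPE vocabulary
CHAR_VOCAB_SIZE = 300         # characters plus special symbols
D_MODEL = 512                 # Transformer Base dimensionality


class Seq2SeqTransformer(nn.Module):
    """Vanilla 6-layer encoder/decoder Transformer.

    Positional encodings and attention masks are omitted for brevity;
    only the vocabulary-dependent layers matter for this sketch.
    """

    def __init__(self, vocab_size: int):
        super().__init__()
        self.src_embed = nn.Embedding(vocab_size, D_MODEL)
        self.tgt_embed = nn.Embedding(vocab_size, D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=6, num_decoder_layers=6,
            dim_feedforward=2048, batch_first=True,
        )
        self.out_proj = nn.Linear(D_MODEL, vocab_size)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        h = self.transformer(self.src_embed(src_ids), self.tgt_embed(tgt_ids))
        return self.out_proj(h)  # logits over the current vocabulary


# Stage 1: train on subword-segmented parallel data (training loop omitted).
model = Seq2SeqTransformer(SUBWORD_VOCAB_SIZE)
# ... standard NMT training with subword token ids ...

# Stage 2: keep the trained Transformer body, swap the vocabulary-dependent
# layers for character-sized ones, and finetune on character-segmented data.
# Re-initializing these layers is an assumption of this sketch.
model.src_embed = nn.Embedding(CHAR_VOCAB_SIZE, D_MODEL)
model.tgt_embed = nn.Embedding(CHAR_VOCAB_SIZE, D_MODEL)
model.out_proj = nn.Linear(D_MODEL, CHAR_VOCAB_SIZE)
# ... continue training with character token ids ...

# Quick shape check with dummy character-id batches.
dummy_src = torch.randint(0, CHAR_VOCAB_SIZE, (2, 20))
dummy_tgt = torch.randint(0, CHAR_VOCAB_SIZE, (2, 25))
logits = model(dummy_src, dummy_tgt)  # (2, 25, CHAR_VOCAB_SIZE)
```

The point of the second stage is that the expensive part of the model, the 6-layer encoder and decoder, is reused as-is, so only the embedding and output projection need to adapt to the character vocabulary during finetuning.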