
Mask-Predict: Parallel Decoding of Conditional Masked Language Models



Abstract

Most machine translation systems generate text autoregressively from left to right. Instead, we use a masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a partially masked target translation. This approach allows for efficient iterative decoding, where we first predict all of the target words non-autoregressively, and then repeatedly mask out and regenerate the subset of words that the model is least confident about. By applying this strategy for a constant number of iterations, our model improves state-of-the-art performance levels for non-autoregressive and parallel decoding translation models by over 4 BLEU on average. It is also able to reach within about 1 BLEU point of a typical left-to-right transformer model, while decoding significantly faster.
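The decoding loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: `toy_cmlm` is a stand-in for a trained conditional masked language model, with fabricated confidences so the sketch runs on its own; the linear mask-size schedule follows the "constant number of iterations" strategy from the abstract.

```python
MASK = "<mask>"

def toy_cmlm(source_tokens, masked_target):
    """Hypothetical stand-in for a conditional masked LM: returns a
    (token, confidence) prediction for every target position. A real
    CMLM would condition on the source and the unmasked target words;
    here we fake both the tokens and the confidences."""
    preds = []
    context = sum(tok != MASK for tok in masked_target)  # unmasked count
    for i in range(len(masked_target)):
        word = source_tokens[i % len(source_tokens)]      # dummy token
        conf = min(0.99, 0.5 + 0.1 * context + 0.01 * i)  # fake confidence
        preds.append((word, conf))
    return preds

def mask_predict(source_tokens, target_len, iterations=4):
    # Iteration 0: the target is fully masked, so every word is
    # predicted in parallel (the non-autoregressive first pass).
    target = [MASK] * target_len
    preds = toy_cmlm(source_tokens, target)
    target = [w for w, _ in preds]
    confs = [c for _, c in preds]

    for t in range(1, iterations):
        # Linearly shrinking mask size: re-mask the n least-confident
        # tokens, where n decays toward 0 over the fixed iteration budget.
        n = int(target_len * (iterations - t) / iterations)
        if n == 0:
            break
        worst = sorted(range(target_len), key=lambda i: confs[i])[:n]
        masked = list(target)
        for i in worst:
            masked[i] = MASK
        preds = toy_cmlm(source_tokens, masked)
        # Regenerate only the re-masked positions; keep the rest fixed.
        for i in worst:
            target[i], confs[i] = preds[i]
    return target
```

With a fixed iteration count, total decoding cost is a constant number of parallel model calls regardless of target length, which is where the speedup over left-to-right decoding comes from.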
