Annual Meeting of the Association for Computational Linguistics

Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation


Abstract

This paper proposes a simple and effective approach to address the problem of posterior collapse in conditional variational autoencoders (CVAEs). The approach improves the performance of machine translation models trained on noisy or monolingual data, as well as in conventional settings. Extending the Transformer and conditional VAEs, our proposed latent variable model measurably prevents posterior collapse by (1) using a modified evidence lower bound (ELBO) objective that promotes mutual information between the latent variable and the target, and (2) guiding the latent variable with an auxiliary bag-of-words prediction task. As a result, the proposed model yields improved translation quality compared to existing variational NMT models on WMT Ro→En and De→En. With the latent variables effectively utilized, our model is more robust than a non-latent Transformer in handling uncertainty, both when exploiting noisy source-side monolingual data (up to +3.2 BLEU) and when training with weakly aligned web-mined parallel data (up to +4.7 BLEU).
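
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal Gaussian posterior with a standard normal prior, and the function names, the weights `kl_weight` and `bow_weight`, and the way the terms are summed are all hypothetical. The mutual-information term of the modified ELBO is not spelled out in the abstract, so it is left out here. The sketch shows that a collapsed posterior (one equal to the prior) has zero KL divergence, and that a bag-of-words loss scores the target tokens without regard to their order.

```python
import math


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a single latent vector.

    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar)
    )


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def bag_of_words_nll(bow_logits, target_ids):
    """Negative log-likelihood of the target tokens under one order-free
    distribution over the vocabulary (hypothetically predicted from z)."""
    log_probs = log_softmax(bow_logits)
    return -sum(log_probs[t] for t in target_ids)


def total_loss(recon_nll, mu, logvar, bow_logits, target_ids,
               kl_weight=1.0, bow_weight=1.0):
    """Hypothetical combination: reconstruction + weighted KL + weighted BoW."""
    return (recon_nll
            + kl_weight * gaussian_kl(mu, logvar)
            + bow_weight * bag_of_words_nll(bow_logits, target_ids))


# A collapsed posterior matches the prior, so its KL term vanishes.
collapsed_kl = gaussian_kl([0.0, 0.0], [0.0, 0.0])
# An informative posterior pays a non-zero KL cost.
informative_kl = gaussian_kl([1.0, -1.0], [0.0, 0.0])
# The bag-of-words loss is invariant to the order of the target tokens.
bow_a = bag_of_words_nll([0.5, 1.5, -0.2, 0.0], [1, 3])
bow_b = bag_of_words_nll([0.5, 1.5, -0.2, 0.0], [3, 1])
```

Because the bag-of-words term depends on the latent variable alone in this sketch, the only way to reduce it is to encode information about the target in the latent variable, which works against collapse.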