Conference: 22nd Conference on Computational Natural Language Learning

Bidirectional Generative Adversarial Networks for Neural Machine Translation



Abstract

Generative Adversarial Networks (GANs) have been proposed to tackle the exposure bias problem of Neural Machine Translation (NMT). However, the discriminator typically makes GAN training unstable because it is inadequately trained: the search space is so large that sampled translations are not sufficient for discriminator training. To address this issue and stabilize GAN training, we propose a novel Bidirectional Generative Adversarial Network for Neural Machine Translation (BGAN-NMT), which uses a generator model as the discriminator, so that the discriminator naturally considers the entire translation space and the inadequate training problem is alleviated. To this end, the generator and the discriminator are both designed to model the joint probability of sentence pairs. They differ in how they decompose it: the generator uses a source language model and a source-to-target translation model, while the discriminator is formulated as a target language model and a target-to-source translation model. To further exploit this symmetry, an auxiliary GAN is introduced that adopts the generator and discriminator models of the original GAN as its own discriminator and generator, respectively. The two GANs are trained alternately to update the parameters. Experimental results on German-English and Chinese-English translation tasks demonstrate that our method not only stabilizes GAN training but also achieves significant improvements over baseline systems.
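
The two decompositions described in the abstract can be written out with the chain rule. This is only a sketch: the symbols x (source sentence), y (target sentence), and the subscripts G and D are notation assumed here for illustration, not taken from the paper, and the training objective is not given in the abstract and is therefore not shown.

```latex
% Assumed notation: x = source sentence, y = target sentence.
% Generator: source language model times source-to-target translation model.
P_G(x, y) = P(x)\, P(y \mid x)

% Discriminator: target language model times target-to-source translation model.
P_D(x, y) = P(y)\, P(x \mid y)
```

Both expressions factor the same joint probability of a sentence pair, which is why the auxiliary GAN can swap the two roles and use the first model as its discriminator and the second as its generator.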
