Generative adversarial training for neural machine translation

Abstract

Neural machine translation (NMT) is typically optimized to generate sentences that share as many n-grams as possible with the ground-truth target. However, it is widely acknowledged that n-gram precision, a manually designed approximate loss function, may mislead the model into generating suboptimal translations. To solve this problem, we train the NMT model to generate human-like translations directly by using a generative adversarial net, an approach that has achieved great success in computer vision. In this paper, we build a conditional sequence generative adversarial net (CSGAN-NMT) which comprises two adversarial sub-models: a generative model (generator), which translates the source sentence into the target sentence as traditional NMT models do, and a discriminative model (discriminator), which discriminates the machine-translated target sentence from the human-translated one. The two sub-models play a minimax game and achieve a win-win situation when reaching a Nash equilibrium. As a variant of the single generator-discriminator model, the multi-CSGAN-NMT, which contains multiple discriminators and generators, is also proposed. In the multi-CSGAN-NMT model, each generator is viewed as an agent which can interact with others and even transfer messages. Experiments show that the proposed CSGAN-NMT model obtains substantial improvements over a strong baseline, and that the improvement of the multi-CSGAN-NMT model is even more remarkable. (C) 2018 Elsevier B.V. All rights reserved.
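
The abstract does not state the training objective itself. As a rough illustration only, the minimax game it describes could be written in the standard conditional GAN form below. This is an assumption, not the paper's own formula: x denotes a source sentence, y its human translation, y' a translation sampled from the generator G, and D(x, y) the discriminator's estimate of the probability that y is a human translation of x.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{(x, y) \sim P_{\mathrm{data}}} \bigl[ \log D(x, y) \bigr]
  + \mathbb{E}_{x \sim P_{\mathrm{data}},\; y' \sim G(\cdot \mid x)} \bigl[ \log \bigl( 1 - D(x, y') \bigr) \bigr]
```

Under this reading, D is trained to assign high scores to human translations and low scores to generated ones, while G is trained to produce translations that D cannot tell apart from human ones.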

Bibliographic details

  • Source
    Neurocomputing | 2018, No. 10 | pp. 146-155 | 10 pages
  • Indexing: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
