Journal: ACM Transactions on Asian Language Information Processing

Adversarial Training for Unknown Word Problems in Neural Machine Translation


Abstract

Nearly all work in neural machine translation (NMT) is limited to a quite restricted vocabulary, crudely treating all other words as the same <unk> symbol. When translating morphologically rich languages, unknown (UNK) words also arise because the translation model misinterprets morphological changes. In this study, we explore two ways to alleviate the UNK problem in NMT: a new generative adversarial network (with added value constraints and semantic enhancement) and a preprocessing technique that mixes in morphological noise. The training process is like a win-win game in which the players are three adversarial sub-models (generator, filter, and discriminator). In this game, the filter focuses the discriminator's attention on negative generations that contain noise, which improves training efficiency. In the end, the discriminator cannot easily distinguish the negative samples produced by the generator with the filter from human translations. Experimental results show that the proposed method significantly improves over several strong baseline models across various language pairs and achieves state-of-the-art results on the newly emerged Mongolian-Chinese task.
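
To make the UNK problem described in the abstract concrete, here is a minimal sketch of how a restricted vocabulary collapses every out-of-vocabulary word into one symbol. It is not the paper's method. The helper names `build_vocab` and `replace_unknown`, the toy corpus, and the vocabulary size are all assumptions made for illustration.

```python
from collections import Counter

UNK = "<unk>"  # the single placeholder symbol for all out-of-vocabulary words


def build_vocab(sentences, max_size):
    """Keep only the max_size most frequent whitespace-separated tokens (hypothetical helper)."""
    counts = Counter(tok for sent in sentences for tok in sent.split())
    return {tok for tok, _ in counts.most_common(max_size)}


def replace_unknown(sentence, vocab):
    """Map every token outside the vocabulary to the same UNK symbol (hypothetical helper)."""
    return " ".join(tok if tok in vocab else UNK for tok in sentence.split())


# Toy corpus, assumed for illustration only.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
vocab = build_vocab(corpus, max_size=3)

# "dog" and "ran" are distinct words, but both fall outside the vocabulary
# and become indistinguishable to the model.
encoded = replace_unknown("the dog ran", vocab)
print(encoded)
```

With a vocabulary of only three tokens, two different words in the input are replaced by the same symbol, which is the information loss the abstract calls crude.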
