
Learning to Generate Word- and Phrase-Embeddings for Efficient Phrase-Based Neural Machine Translation


Abstract

Neural machine translation (NMT) often fails in one-to-many translation, e.g., in the translation of multi-word expressions, compounds, and collocations. To improve the translation of phrases, phrase-based NMT systems have been proposed; these typically combine word-based NMT with external phrase dictionaries or with phrase tables from phrase-based statistical MT systems. These solutions introduce a significant overhead of additional resources and computational costs. In this paper, we introduce a phrase-based NMT model built upon continuous-output NMT, in which the decoder generates embeddings of words or phrases. The model uses a fertility module, which guides the decoder to generate embeddings of sequences of varying lengths. We show that our model learns to translate phrases better, performing on par with state-of-the-art phrase-based NMT. Since our model does not resort to softmax computation over a huge vocabulary of phrases, its training time is about 112x faster than the baseline.
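To make the idea in the abstract concrete, the sketch below shows a single decoding step in which the decoder emits a continuous output embedding (rather than a softmax distribution over a phrase vocabulary), a fertility head predicts how many target tokens that embedding should cover, and decoding is done by nearest-neighbour lookup in a precomputed word/phrase embedding table. This is a minimal illustration under assumed names (ContinuousDecoderStep, fertility_head, nearest_phrase, max_fertility); the paper's actual architecture and training objective may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContinuousDecoderStep(nn.Module):
    """One step of a continuous-output decoder with a fertility head.

    Illustrative sketch only: module names and sizes are assumptions,
    not the authors' released code.
    """

    def __init__(self, hidden_dim: int, emb_dim: int, max_fertility: int = 4):
        super().__init__()
        self.rnn_cell = nn.GRUCell(emb_dim, hidden_dim)
        # Projects the decoder state to a continuous output embedding.
        self.out_proj = nn.Linear(hidden_dim, emb_dim)
        # Fertility head: predicts how many target tokens (1..max_fertility)
        # the emitted embedding should expand into.
        self.fertility_head = nn.Linear(hidden_dim, max_fertility)

    def forward(self, prev_emb: torch.Tensor, hidden: torch.Tensor):
        hidden = self.rnn_cell(prev_emb, hidden)
        out_emb = self.out_proj(hidden)            # continuous output, no softmax
        fert_logits = self.fertility_head(hidden)  # distribution over lengths
        fertility = fert_logits.argmax(dim=-1) + 1  # predicted segment length
        return out_emb, fertility, hidden


def nearest_phrase(out_emb: torch.Tensor, embedding_table: torch.Tensor) -> torch.Tensor:
    """Map an emitted embedding to the closest word/phrase embedding.

    Replaces the softmax over a huge phrase vocabulary with a
    nearest-neighbour search in embedding space.
    """
    sims = F.cosine_similarity(
        out_emb.unsqueeze(1),            # (batch, 1, emb_dim)
        embedding_table.unsqueeze(0),    # (1, vocab, emb_dim)
        dim=-1,
    )                                    # (batch, vocab)
    return sims.argmax(dim=-1)           # index of nearest word/phrase
```

In continuous-output NMT of this kind, training typically minimizes a distance-based loss (e.g., cosine or von Mises-Fisher) between the emitted embedding and the reference word or phrase embedding; avoiding the softmax over the phrase vocabulary is what makes the reported speedup over the baseline plausible.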
