Residual Stacking of RNNs for Neural Machine Translation

Abstract

To enhance Neural Machine Translation (NMT) models, several obvious approaches can be considered, such as enlarging the hidden size of the recurrent layers or stacking multiple RNN layers. Surprisingly, we observe that naively stacking RNNs in the decoder slows down training and degrades performance. In this paper, we demonstrate that applying residual connections across the depth of stacked RNNs helps optimization; we refer to this as residual stacking. In empirical evaluation, residual stacking of decoder RNNs gives superior results compared with other ways of enhancing the model under a fixed parameter budget. Our systems submitted to WAT2016 are based on an ensemble of NMT models with residual stacking in the decoder. To further improve performance, we also try various methods of system combination in our experiments.
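To illustrate the core idea, the following is a minimal, hypothetical sketch (in PyTorch, not the authors' code) of residual connections applied across a stack of decoder RNN layers: each layer's output is added to its input, so the stack stays easy to optimize as depth grows. The class name, choice of GRU layers, and hyperparameters are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class ResidualStackedGRU(nn.Module):
    """Stack of single-layer GRUs with a residual (identity) connection
    around each layer. Illustrative sketch only."""

    def __init__(self, hidden_size: int, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.GRU(hidden_size, hidden_size, batch_first=True)
             for _ in range(num_layers)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, hidden_size)
        for rnn in self.layers:
            out, _ = rnn(x)
            x = x + out  # residual connection across the layer's depth
        return x

if __name__ == "__main__":
    decoder_stack = ResidualStackedGRU(hidden_size=256, num_layers=4)
    dummy = torch.randn(2, 10, 256)
    print(decoder_stack(dummy).shape)  # torch.Size([2, 10, 256])
```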
