International Conference on Computational Linguistics

Recurrent Neural Network-based Tuple Sequence Model for Machine Translation



Abstract

In this paper, we propose a recurrent neural network-based tuple sequence model (RNNTSM) that helps phrase-based translation models overcome the phrasal independence assumption. RNNTSM can potentially capture arbitrarily long contextual information when estimating tuple probabilities in continuous space. However, it suffers from a severe data sparsity problem, because the tuple vocabulary is large while the bilingual training data is limited. To tackle this problem, we propose two improvements. The first factorizes the bilingual tuples of RNNTSM into source and target sides, which we call factorized RNNTSM. The second decomposes phrasal bilingual tuples into word-level bilingual tuples, providing a fine-grained tuple model. Extensive experimental results on the IWSLT2012 test sets showed that the proposed approach substantially improved translation quality over state-of-the-art phrase-based translation systems (baselines) and recurrent neural network language models (RNNLMs). Compared with the baselines, the BLEU scores on the English-French and English-German tasks improved by 2.1%-2.6% and 1.8%-2.1%, respectively.
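The abstract itself contains no code, so the following is only a minimal sketch, in PyTorch, of what a factorized tuple sequence model along these lines could look like: an RNN runs over a sequence of bilingual tuples, and each tuple embedding is composed from a source-side and a target-side embedding so that two small factor vocabularies replace one huge joint tuple vocabulary. All names and dimensions (FactorizedRNNTSM, src_vocab, tgt_vocab, emb_dim, hidden_dim) are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of a factorized RNN tuple sequence model; the paper's
# exact architecture and factorization may differ.
import torch
import torch.nn as nn

class FactorizedRNNTSM(nn.Module):
    """RNN over bilingual tuples (src phrase, tgt phrase).

    Each tuple is embedded as the concatenation of a source-side and a
    target-side embedding, which mitigates sparsity versus embedding the
    joint tuple directly. The recurrent hidden state carries arbitrarily
    long tuple history when predicting the next tuple.
    """
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.rnn = nn.RNN(2 * emb_dim, hidden_dim, batch_first=True)
        # Predict the next tuple factor by factor, for the same sparsity reason.
        self.src_out = nn.Linear(hidden_dim, src_vocab)
        self.tgt_out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # src_ids, tgt_ids: (batch, seq_len) indices of the two tuple sides.
        x = torch.cat([self.src_emb(src_ids), self.tgt_emb(tgt_ids)], dim=-1)
        h, _ = self.rnn(x)            # (batch, seq_len, hidden_dim)
        return self.src_out(h), self.tgt_out(h)  # next-tuple factor logits

# Toy usage: score a derivation of three tuples for a one-sentence batch.
model = FactorizedRNNTSM(src_vocab=1000, tgt_vocab=1200)
src = torch.tensor([[4, 17, 9]])
tgt = torch.tensor([[7, 2, 33]])
src_logits, tgt_logits = model(src, tgt)
# Under the factorization, log P(tuple_{t+1} | tuple_1..t) decomposes into
# the sum of the source-side and target-side log-probabilities.
src_logp = src_logits.log_softmax(dim=-1)
tgt_logp = tgt_logits.log_softmax(dim=-1)
```

Decomposing phrasal tuples into word-level tuples, the paper's second improvement, would correspond in this sketch to indexing the embeddings with word rather than phrase identifiers, shrinking the factor vocabularies further at the cost of longer tuple sequences.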
