Recurrent Neural Network-based Tuple Sequence Model for Machine Translation

Abstract

In this paper, we propose a recurrent neural network-based tuple sequence model (RNNTSM) that helps phrase-based translation models overcome the phrasal independence assumption. Our RNNTSM can potentially capture arbitrarily long contextual information when estimating tuple probabilities in continuous space. It suffers, however, from a severe data sparsity problem, since the tuple vocabulary is large while the bilingual training data is limited. To tackle this problem, we propose two improvements. The first is to factorize the bilingual tuples of the RNNTSM into source and target sides, which we call the factorized RNNTSM. The second is to decompose phrasal bilingual tuples into word bilingual tuples, yielding a finer-grained tuple model. Extensive experimental results on the IWSLT2012 test sets show that the proposed approach substantially improves translation quality over state-of-the-art phrase-based translation systems (baselines) and recurrent neural network language models (RNNLMs). Compared with the baselines, BLEU scores on the English-French and English-German tasks improve by 2.1%-2.6% and 1.8%-2.1%, respectively.
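
The abstract describes the factorized RNNTSM only at a high level. Below is a minimal sketch of how such a model could be set up, assuming a GRU recurrence in PyTorch; the class name, layer sizes, and the choice of GRU (rather than the paper's exact recurrent unit) are illustrative assumptions, not the authors' implementation.

# Illustrative sketch (not the authors' code): a factorized RNN tuple
# sequence model. Each bilingual tuple is split into its source side and
# target side, each with its own embedding table, which keeps the two
# vocabularies far smaller than a vocabulary of whole tuples. This is the
# sparsity remedy the abstract describes. All names and sizes are assumptions.
import torch
import torch.nn as nn

class FactorizedRNNTSM(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hid_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)  # source side of tuple
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)  # target side of tuple
        # The recurrent layer carries context over the whole tuple sequence,
        # so probability estimates are not phrase-local.
        self.rnn = nn.GRU(2 * emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)  # predict next target side

    def forward(self, src_ids, tgt_ids):
        # src_ids, tgt_ids: (batch, seq_len) indices of the tuple sides
        x = torch.cat([self.src_emb(src_ids), self.tgt_emb(tgt_ids)], dim=-1)
        h, _ = self.rnn(x)
        return self.out(h)  # (batch, seq_len, tgt_vocab) logits

# Toy usage: score sequences of 5 tuples for a batch of 2 sentences.
model = FactorizedRNNTSM(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 5))
tgt = torch.randint(0, 1000, (2, 5))
log_probs = torch.log_softmax(model(src, tgt), dim=-1)
# Summing the per-position log-probabilities of the observed next tuples
# would give a sequence score usable for reranking translation hypotheses.

Under this factorization the model never embeds a full bilingual tuple directly, so the parameter count grows with the sum of the two side vocabularies rather than with the much larger set of observed tuples; the abstract's second improvement (decomposing phrasal tuples into word tuples) would shrink the index space further.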