Case-Sensitive Neural Machine Translation

Abstract

Although word case is important lexical information for Latin languages, it is often ignored in machine translation. We observe that translation performance drops significantly when case-sensitive evaluation metrics are used. In this paper, we introduce two types of case-sensitive neural machine translation (NMT) approaches to alleviate this problem: (i) adding case tokens to the decoding sequence, and (ii) adding case prediction to conventional NMT. Our proposed approaches incorporate case information into the NMT decoder by jointly learning target word generation and word case prediction. We compare our approaches with several kinds of baselines, including NMT with naive case-restoration methods, and analyze the impact of various setups on our approaches. Experimental results on three typical translation tasks (Zh-En, En-Fr, En-De) show that our proposed methods improve case-sensitive BLEU scores by up to 2.5, 1.0, and 0.5, respectively. Further analyses also illustrate the inherent reasons why our approaches lead to different improvements on different translation tasks.
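
As a rough illustration of approach (i), the sketch below lower-cases a target sentence and inserts a case token before each word that was capitalized or fully upper-case, then inverts the process on a decoded sequence. The token names `<cap>` and `<up>`, the function names, and the three-way case scheme are assumptions made for this example; the abstract does not specify the actual token inventory or where the tokens are placed. Mixed-case words such as "iPhone" are not handled by this simple scheme.

```python
# Hypothetical case tokens (assumed for illustration; not taken from the paper).
CAP = "<cap>"   # the next word starts with a capital letter
UPPER = "<up>"  # the next word is entirely upper-case


def add_case_tokens(words):
    """Lower-case each word and prepend a case token where one is needed."""
    out = []
    for w in words:
        if len(w) > 1 and w.isupper():
            out.append(UPPER)
        elif w[:1].isupper():
            out.append(CAP)
        out.append(w.lower())
    return out


def restore_case(tokens):
    """Invert add_case_tokens: apply each case token to the word after it."""
    out = []
    pending = None
    for t in tokens:
        if t in (CAP, UPPER):
            pending = t
            continue
        if pending == UPPER:
            t = t.upper()
        elif pending == CAP:
            t = t[:1].upper() + t[1:]
        out.append(t)
        pending = None
    return out


if __name__ == "__main__":
    target = ["The", "NMT", "model", "works"]
    encoded = add_case_tokens(target)
    print(encoded)
    print(restore_case(encoded))
```

With a scheme like this, the decoder's vocabulary contains only lower-cased words plus the case tokens, so case becomes something the model generates explicitly rather than something restored afterwards.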
