Venue: Workshop on Asian Translation

Overcoming the Rare Word Problem for Low-Resource Language Pairs in Neural Machine Translation



Abstract

Among the six challenges of neural machine translation (NMT) identified by Koehn and Knowles (2017), the rare-word problem is considered the most severe, especially when translating low-resource languages. In this paper, we propose three solutions for handling rare words in NMT systems. First, we enhance the source context used to predict target words by directly connecting the source embeddings to the output of the attention component in NMT. Second, we propose an algorithm that learns the morphology of unknown English words in a supervised way in order to minimize the adverse effect of the rare-word problem. Finally, we exploit synonym relations from WordNet to overcome the out-of-vocabulary (OOV) problem of NMT. We evaluate our approaches on two low-resource language pairs: English-Vietnamese and Japanese-Vietnamese. In our experiments, we achieve significant improvements of up to roughly +1.0 BLEU points on both language pairs.
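The abstract does not spell out how the synonym relations are applied, so the following is only a minimal sketch of one plausible reading: each out-of-vocabulary source token is replaced by the first of its synonyms that the NMT vocabulary already contains. The function name `replace_oov_with_synonyms`, the `<unk>` fallback, and the small hand-written `synonyms` table (a stand-in for a real WordNet lookup) are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Set


def replace_oov_with_synonyms(
    tokens: List[str],
    vocab: Set[str],
    synonyms: Dict[str, List[str]],
    unk: str = "<unk>",
) -> List[str]:
    """Replace each OOV token with its first in-vocabulary synonym.

    `synonyms` is a hypothetical stand-in for a WordNet lookup; a token
    with no in-vocabulary synonym falls back to the `unk` symbol.
    """
    output = []
    for token in tokens:
        if token in vocab:
            # In-vocabulary tokens pass through unchanged.
            output.append(token)
            continue
        # Pick the first candidate synonym the model already knows.
        replacement = next(
            (s for s in synonyms.get(token, []) if s in vocab), None
        )
        output.append(replacement if replacement is not None else unk)
    return output


# Toy data, invented purely for this example.
vocab = {"the", "car", "is", "fast"}
synonyms = {"automobile": ["motorcar", "car"], "quick": ["fast"]}
sentence = ["the", "automobile", "is", "quick", "zzz"]
result = replace_oov_with_synonyms(sentence, vocab, synonyms)
```

Here `automobile` and `quick` are mapped to the in-vocabulary `car` and `fast`, while `zzz` has no known synonym and becomes `<unk>`. A real system would also need to handle word senses and inflection, which this sketch ignores.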
