Conference: IEEE International Conference on Computer and Communication Systems

Research on Unknown Words Processing of Mongolian-Chinese Neural Machine Translation Based on Semantic Similarity

Abstract

This paper studies two methods for processing unknown words in Mongolian-Chinese neural machine translation: one based on a similarity model and one based on a Mongolian-Chinese alignment dictionary. The similarity-based method uses word vectors to capture the semantic and grammatical characteristics of words, calculates the semantic similarity between the unknown words and the words in the vocabulary, and replaces every unknown word with the in-vocabulary word closest to it in meaning. The dictionary-based method replaces unknown words using word alignment information. Finally, the original corpus and the new corpus in which the unknown words have been replaced are merged and used to train the model. Experiments show that the BLEU score improved by 0.95 percentage points on the Mongolian-Chinese translation task.
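The similarity-based replacement described in the abstract can be illustrated with a minimal sketch. The abstract does not say which embedding model or similarity measure the authors use, so cosine similarity over pre-trained word vectors is an assumption here. The function names (`nearest_in_vocab`, `replace_unknown_words`), the `<unk>` fallback, and the toy two-dimensional English vectors are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0


def nearest_in_vocab(word, embeddings, vocab):
    """Return the in-vocabulary word most similar to `word`, or None
    if `word` has no vector (hypothetical helper)."""
    if word not in embeddings:
        return None
    target = embeddings[word]
    best_word, best_score = None, -1.0
    for candidate in sorted(vocab):  # sorted for deterministic tie-breaking
        if candidate not in embeddings:
            continue
        score = cosine_similarity(target, embeddings[candidate])
        if score > best_score:
            best_word, best_score = candidate, score
    return best_word


def replace_unknown_words(tokens, embeddings, vocab, unk="<unk>"):
    """Replace each out-of-vocabulary token with its nearest in-vocabulary
    neighbour; fall back to `unk` when no vector is available (assumption)."""
    replaced = []
    for token in tokens:
        if token in vocab:
            replaced.append(token)
        else:
            replaced.append(nearest_in_vocab(token, embeddings, vocab) or unk)
    return replaced


# Toy data: made-up vectors, not taken from the paper.
embeddings = {
    "horse": np.array([1.0, 0.0]),
    "pony": np.array([0.9, 0.1]),
    "river": np.array([0.0, 1.0]),
    "rides": np.array([0.5, 0.5]),
}
vocab = {"horse", "river", "rides"}

original_corpus = [["pony", "rides", "zebra"]]
replaced_corpus = [
    replace_unknown_words(sentence, embeddings, vocab)
    for sentence in original_corpus
]

# The abstract's final step: merge the original and replaced corpora
# before training.
merged_corpus = original_corpus + replaced_corpus
print(replaced_corpus)
```

In this toy run, "pony" is outside the vocabulary but has a vector close to "horse", so it is replaced by "horse"; "zebra" has no vector at all and falls back to `<unk>`. A real system would work with Mongolian and Chinese vocabularies and would need the alignment-dictionary method for cases like the latter.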
