International Journal of Intelligent Systems Technologies and Applications

Improving English-Arabic statistical machine translation with morpho-syntactic and semantic word class


Abstract

In this paper, we present a new method for extracting and integrating morpho-syntactic and semantic word classes in a statistical machine translation (SMT) context to improve the quality of English-Arabic translation. The method can be applied to different statistical machine translation systems and to languages with complicated morphological paradigms. We first identify morpho-syntactic word classes and use them to build our statistical language model. We then apply a semantic word clustering algorithm to the English side. The resulting semantic word classes are projected from the English side to the featured Arabic side, using the word alignments produced in the alignment step by the GIZA++ tool. Finally, we apply a new process that incorporates the semantic classes in order to improve SMT quality. We demonstrate the method's efficacy on small and larger English-to-Arabic translation tasks. The experimental results show that introducing morpho-syntactic and semantic word classes yields a 7.7% relative improvement in BLEU score.
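
The projection step can be illustrated with a minimal sketch. The abstract does not specify how the classes are transferred, so everything here is an assumption for illustration: alignments are taken to be in the "i-j" (English index, Arabic index) text format commonly produced after symmetrizing GIZA++ output, each Arabic word receives the English class it is aligned to most often, and the function names, class labels, and transliterated toy tokens are all hypothetical.

```python
from collections import Counter, defaultdict


def parse_alignment(line):
    """Parse a line of 'i-j' pairs (English index - Arabic index) into tuples."""
    return [tuple(int(k) for k in pair.split("-")) for pair in line.split()]


def project_classes(bitext, en_class):
    """Project English word classes onto Arabic words by majority vote.

    bitext   : iterable of (en_tokens, ar_tokens, alignment_line) triples
    en_class : dict mapping an English word to its semantic class label
    Returns a dict mapping each aligned Arabic word to the English class
    it was aligned with most often (an assumed heuristic, not the paper's).
    """
    votes = defaultdict(Counter)
    for en_tokens, ar_tokens, align_line in bitext:
        for i, j in parse_alignment(align_line):
            cls = en_class.get(en_tokens[i])
            if cls is not None:  # skip English words without a class
                votes[ar_tokens[j]][cls] += 1
    return {word: counts.most_common(1)[0][0] for word, counts in votes.items()}


# Hypothetical toy data: transliterated Arabic tokens and made-up class labels.
toy_bitext = [
    (["new", "book"], ["kitab", "jadid"], "0-1 1-0"),
    (["a", "book"], ["kitab"], "1-0"),
]
toy_en_class = {"book": "C12", "new": "C7"}

if __name__ == "__main__":
    print(project_classes(toy_bitext, toy_en_class))
```

Majority voting is only one plausible way to resolve an Arabic word that is aligned to English words from several classes; the paper may use a different criterion.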
