Published in: First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties, 2011

Dialectal to Standard Arabic Paraphrasing to Improve Arabic-English Statistical Machine Translation


Abstract

This paper addresses improving the quality of Arabic-English statistical machine translation (SMT) on dialectal Arabic text using morphological knowledge. We present a lightweight rule-based approach to producing Modern Standard Arabic (MSA) paraphrases of dialectal Arabic out-of-vocabulary (OOV) words and low-frequency words. Our approach extends an existing MSA analyzer with a small number of morphological clitics, and uses transfer rules to generate paraphrase lattices that are input to a state-of-the-art phrase-based SMT system. This approach improves BLEU scores on a blind test set by 0.56 absolute BLEU (or 1.5% relative). A manual error analysis of translated dialectal words shows that our system produces correct translations 74% of the time for OOVs and 60% of the time for low-frequency words.
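
To make the idea of a paraphrase lattice concrete, the following is a minimal sketch and not the authors' implementation. It assumes two toy prefix rules written in Buckwalter-style transliteration (dialectal future `H` to MSA `s`, dialectal progressive `b` dropped), a hypothetical `min_count` threshold for what counts as low frequency, and a plain vocabulary lookup standing in for the morphological analyzer. The names `PREFIX_RULES`, `msa_paraphrases`, and `build_lattice` are illustrative only.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical toy transfer rules (NOT the paper's rule set):
# dialectal verbal proclitic -> candidate MSA replacements.
PREFIX_RULES: Dict[str, List[str]] = {
    "H": ["s"],  # dialectal future marker -> MSA future marker
    "b": [""],   # dialectal progressive marker -> dropped in MSA
}


def msa_paraphrases(word: str, counts: Counter) -> List[str]:
    """Return rule-generated MSA candidates that occur in the training data."""
    candidates: List[str] = []
    for dial_prefix, msa_prefixes in PREFIX_RULES.items():
        if word.startswith(dial_prefix) and len(word) > len(dial_prefix):
            rest = word[len(dial_prefix):]
            for msa_prefix in msa_prefixes:
                candidate = msa_prefix + rest
                # Simplification: a vocabulary check replaces real analysis.
                if counts[candidate] > 0 and candidate not in candidates:
                    candidates.append(candidate)
    return candidates


def build_lattice(tokens: List[str], counts: Counter,
                  min_count: int = 2) -> List[List[str]]:
    """Build one list of alternatives per input position.

    Frequent words keep a single path; OOV and low-frequency words keep
    the original form plus any MSA paraphrases as parallel alternatives.
    """
    lattice: List[List[str]] = []
    for tok in tokens:
        if counts[tok] >= min_count:
            lattice.append([tok])
        else:
            lattice.append([tok] + msa_paraphrases(tok, counts))
    return lattice


# Toy training-side word counts (made up for illustration).
train_counts = Counter({"syktb": 5, "yktb": 7, "Alwld": 4, "bAb": 3})
lattice = build_lattice(["Alwld", "Hyktb", "byktb", "bAb"], train_counts)
```

In this sketch each position keeps the original word as one path, so the decoder is still free to prefer it. A real system would weight the alternatives and pass them to the decoder in its own lattice input format.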