Conference: 2011 International Conference on Innovations in Information Technology

An overview of the challenges and progress in PeEn-SMT: First large scale Persian-English SMT system

Abstract

This paper documents recent work on PeEn-SMT, our Statistical Machine Translation (SMT) system for the English-Persian language pair. We give details of our previous SMT system and present our current development of significantly larger corpora. We explain how recent tests using these much larger corpora helped us evaluate problems in parallel corpus alignment and corpus content, and how matching the domains of PeEn-SMT's components affects translation outcome. We then focus on combining corpora and approaches to improve test data, giving details of the experimental setup together with a number of experimental results and comparisons between them. We show how one combination of corpora gave us a metric score that outperformed Google Translate for English-to-Persian translation. Finally, we outline areas of intended future work and how we plan to improve the performance of our system to achieve higher metric scores and, ultimately, to provide accurate, reliable language translation.
