Exploiting alignment techniques in MATREX: the DCU machine translation system for IWSLT 2008

Abstract

In this paper, we describe the machine translation (MT) system developed at DCU that was used for our third participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2008). In this participation, we focus on various techniques for word and phrase alignment to improve system quality. Specifically, we try out our word packing and syntax-enhanced word alignment techniques for the Chinese–English task and for the English–Chinese task for the first time. For all translation tasks except Arabic–English, we exploit linguistically motivated bilingual phrase pairs extracted from parallel treebanks. We smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks in order to address the high number of out-of-vocabulary items. We also carry out experiments combining in-domain and out-of-domain data to improve system performance and, finally, we deploy a majority voting procedure combining a language-model-based method and a translation-based method for case and punctuation restoration. We participated in all the translation tasks and translated both the single-best ASR hypotheses and the correct recognition results. The translation results confirm that our new word and phrase alignment techniques are often helpful in improving translation quality, and that the data combination method we proposed can significantly improve system performance.
