Hybrid System Combination for Machine Translation: An Integration of Phrase-level and Sentence-level Combination Approaches.

Abstract

Given the wide range of successful statistical machine translation (MT) approaches that have emerged recently, it would be beneficial to take advantage of their individual strengths and avoid their individual weaknesses. Multi-Engine Machine Translation (MEMT) attempts to do so by either fusing the output of multiple translation engines or selecting the best translation among them, aiming to improve the overall translation quality. In this thesis, we propose to use the phrase or the sentence as our combination unit instead of the word; we propose three new phrase-level models and one sentence-level model with novel features. This contrasts with the most popular system combination technique to date, which relies on word-level confusion network decoding.

Among the three new phrase-level models, the first utilizes source sentences and target translation hypotheses to learn hierarchical phrases, that is, phrases that contain subphrases (Chiang 2007). It then re-decodes the source sentences using the hierarchical phrases to combine the results of multiple MT systems. The other two models view combination as a paraphrasing process and use paraphrasing rules. The paraphrasing rules are composed of either string-to-string paraphrases or hierarchical paraphrases, learned from monolingual word alignments between a selected best translation hypothesis and the other hypotheses. Our experimental results show that all three phrase-level models achieve higher BLEU scores than the best single translation engine. The two paraphrasing models outperform the re-decoding model and the confusion network baseline model.

The sentence-level model exploits more complex syntactic and semantic information than the phrase-level models. It uses consensus, argument alignment, a supertag-based structural language model, and a syntactic error detector. We use our sentence-level model in two ways: the first selects a translated sentence from multiple MT systems as the best translation to serve as a backbone for the paraphrasing process; the second makes the final decision among all fused translations generated by the phrase-level models and all translated sentences of the multiple MT systems. We propose two novel hybrid combination structures that integrate the phrase-level and sentence-level combination frameworks, in order to utilize the advantages of both frameworks and provide a more diverse set of plausible fused translations to consider.
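To make the idea of sentence-level selection by consensus concrete, the sketch below picks, from several system outputs, the hypothesis that agrees most with the others. It is only an illustration under stated assumptions and not the thesis's model: the similarity measure (average unigram and bigram F1), whitespace tokenization, and the function names `ngrams`, `overlap`, and `select_by_consensus` are all hypothetical choices made here, and the other features named in the abstract (argument alignment, the supertag-based language model, the syntactic error detector) are not modeled.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(hyp, other, max_n=2):
    """Average n-gram F1 between two token lists, for n = 1..max_n.

    Assumption: this stands in for a consensus feature; the thesis may
    define consensus differently.
    """
    scores = []
    for n in range(1, max_n + 1):
        a, b = ngrams(hyp, n), ngrams(other, n)
        if not a or not b:
            scores.append(0.0)
            continue
        matched = sum((a & b).values())  # clipped n-gram matches
        if matched == 0:
            scores.append(0.0)
            continue
        precision = matched / sum(a.values())
        recall = matched / sum(b.values())
        scores.append(2 * precision * recall / (precision + recall))
    return sum(scores) / len(scores)


def select_by_consensus(hypotheses):
    """Return the index of the hypothesis most similar, on average, to the others."""
    tokenized = [h.split() for h in hypotheses]  # naive whitespace tokenization
    best_index, best_score = 0, -1.0
    for i, hyp in enumerate(tokenized):
        others = [o for j, o in enumerate(tokenized) if j != i]
        score = sum(overlap(hyp, o) for o in others) / len(others) if others else 0.0
        if score > best_score:
            best_index, best_score = i, score
    return best_index


if __name__ == "__main__":
    outputs = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "a dog stood near the rug",
    ]
    print(outputs[select_by_consensus(outputs)])
```

In a setup like the one the abstract describes, a hypothesis chosen this way could serve as the backbone for paraphrasing, or the same scoring could be applied to the pool of fused and original translations; a real system would combine such a score with other features.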

Bibliographic record

  • Author: Ma, Wei-Yun
  • Author affiliation: Columbia University
  • Degree-granting institution: Columbia University
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 159 p.
  • Original format: PDF
  • Language of text: English
