Automatic improvement of machine translation systems.

Abstract

Achieving high translation quality remains the most daunting challenge Machine Translation (MT) systems currently face. Researchers have explored a variety of methods for including translator feedback in the MT loop. However, most MT systems have failed to incorporate post-editing efforts beyond the addition of corrected translations to the parallel training data for Example-Based and Statistical systems or to a translation memory database. This thesis describes a novel approach that utilizes post-editing information to automatically improve the underlying rules and lexical entries of a Transfer-Based MT system. This process can be divided into two main steps. First, an online translation correction tool allows for easy error diagnosis and implicit error categorization. Then, an Automatic Rule Refiner performs error remediation by tracing errors back to the problematic rules and lexical entries and executing repairs that are mostly lexical and morpho-syntactic in nature (such as word order, missing constituents or incorrect agreement in transfer rules). This approach directly improves the intelligibility of corrected MT output and, more significantly, it generalizes over unseen data, providing improved MT output for similar sentences that have not been corrected.

Experimental results on an English-Spanish MT system show that automatic rule refinements triggered by bilingual speaker corrections successfully translate unseen data that was incorrectly translated by the original, unrefined grammar. Improvements in translation quality over a baseline, as measured by standard automatic evaluation metrics, are statistically significant on a paired two-tailed t-test (p = 0.0051).

One practical application of this research is extending and refining relatively small translation grammars for resource-poor languages, such as Mapudungun and Quechua, into a major language, such as English or Spanish. Initial experimental results on a Spanish-Mapudungun MT system show that rule refinement operations generalize well to a different language pair and are able to correct errors in the grammar and the lexicon.

Bibliographic record

  • Author: Font Llitjos, Ariadna
  • Author affiliation: Carnegie Mellon University
  • Degree-granting institution: Carnegie Mellon University
  • Subject: Engineering Mechanical.
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 197 p.
  • Total pages: 197
  • Original format: PDF
  • Language of text: eng
