Error Analysis of English-Chinese Machine Translation

Abstract

To find a practical way of improving machine translation (MT) quality, the types and distribution of errors in MT output must first be analyzed. This paper analyzes English-Chinese MT errors from the perspective of the naming-telling clause (hereafter, NT clause). MT output was obtained under two input conditions: in the first, whole original English sentences were fed to an MT engine; in the second, the English sentences were first parsed into English NT clauses, which were then fed to the MT engine in order. Errors in the MT output are categorized into three classes: incorrect lexical choices, structural errors, and component omissions. Structural errors are further divided into SV-structure errors and non-SV-structure errors. The data show, first, that structural errors are the major error class, with non-SV-structure errors accounting for the larger share of them; and second, that translation errors decrease significantly after the English sentences are parsed into NT clauses. These results indicate that non-SV clauses are the main source of MT errors and suggest that long English sentences should be parsed into NT clauses before translation.
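The error taxonomy and the two-condition comparison described in the abstract can be sketched as a small tally. This is only an illustration: the names `ErrorType`, `error_distribution`, and `structural_share` are hypothetical, and the annotation lists are made-up toy labels, not the paper's data or results.

```python
from collections import Counter
from enum import Enum


class ErrorType(Enum):
    """The error classes named in the abstract (structural errors split in two)."""
    LEXICAL = "incorrect lexical choice"
    SV_STRUCTURE = "SV-structure error"
    NON_SV_STRUCTURE = "non-SV-structure error"
    OMISSION = "component omission"


# Both structural subtypes, grouped for convenience.
STRUCTURAL = {ErrorType.SV_STRUCTURE, ErrorType.NON_SV_STRUCTURE}


def error_distribution(annotations):
    """Return each error type's share of all annotated errors."""
    counts = Counter(annotations)
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in ErrorType}


def structural_share(annotations):
    """Return the fraction of annotated errors that are structural."""
    counts = Counter(annotations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[t] for t in STRUCTURAL) / total


# Toy, made-up annotations for the two input conditions (assumption:
# one label per error found by a human annotator in the MT output).
whole_sentence_errors = [
    ErrorType.NON_SV_STRUCTURE, ErrorType.NON_SV_STRUCTURE,
    ErrorType.SV_STRUCTURE, ErrorType.LEXICAL, ErrorType.OMISSION,
]
nt_clause_errors = [ErrorType.LEXICAL, ErrorType.NON_SV_STRUCTURE]

if __name__ == "__main__":
    for name, errs in [("whole sentences", whole_sentence_errors),
                       ("NT clauses", nt_clause_errors)]:
        print(name, len(errs), "errors;",
              f"structural share = {structural_share(errs):.2f}")
```

Keeping the structural subtypes as separate enum members lets the same annotations answer both questions raised in the abstract: which top-level class dominates, and which subtype dominates within structural errors.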
