
Generate and repair machine translation.

Abstract

We propose Generate and Repair Machine Translation (GRMT), a constraint-based approach to machine translation (MT) that focuses on accurate translation output. The architecture of GRMT was designed to take advantage of, and improve on, the three classic strategies (Direct MT, Interlingual MT and Transfer MT), the nonlinguistic information strategies (Example-Based MT and Statistics-Based MT), and the hybrid strategies (Knowledge-Based MT and Shake-and-Bake MT) with respect to several translation aspects: simplicity, accuracy and multilingualism.

GRMT performs the translation by generating a Translation Candidate (TC), verifying the syntax and semantics of the TC, and repairing the TC when required. GRMT comprises three modules: Analysis Lite Machine Translation (ALMT), Translation Candidate Evaluation (TCE), and Repair and Iterate (RI).

In generating the TC, GRMT refines the scope of translation choices of each input word by taking into account the differences between languages in a unique way. In selecting an appropriate word for each input word, GRMT considers the semantic relationship between words. This semantic relationship is based on the Word Association (WordAsso) number. A WordAsso number is assigned to a word class. Words are classified according to their meaning and usage. Word classification is designed and used not only in the word selection process but also in the classifier selection process and in semantic representation.

GRMT is highly modular and extendible in the following respects: each component is separate, not only among the translation process components (ALMT, TCE, RI) but also among the knowledge bases, and each component can be extended easily to a larger domain. Adding new languages is also possible since the source language (SL) and the target language (TL) are treated separately. The SL and the TL are connected via the SL-TL dictionary, which contains simple information and is manageable.

An English-Thai translation system has been implemented to illustrate the performance of GRMT. The system has been developed and run under SWI-Prolog 3.2.8. The English and Thai grammars have been developed based on the Head-Driven Phrase Structure Grammar (HPSG) and implemented on the Attribute Logic Engine (ALE).

This English-Thai MT system was evaluated and performs as we intended. ALMT generated acceptable translations (grammatically correct, with correct word usage, and conveying the original meaning) for 47 of the 90 sentences in the test corpus without repair. WE and RI improved 15 sentences using our current HPSG-based grammars and lexicons. Twenty-one sentences that contain logical connections had to be separated into linguistic units before repair could be performed, owing to a current inadequacy in HPSG's semantic representation; each linguistic unit was then repaired successfully. Seven sentences, which involve adding linking words and classifiers in Thai, require further research to develop ways to repair them.
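The generate, evaluate, and repair cycle described in the abstract can be illustrated with a very small sketch. This is not the thesis's implementation (which used SWI-Prolog and HPSG grammars on ALE); it is a toy in Python under loud assumptions: the dictionary entries, the placeholder target tokens, the class numbers standing in for WordAsso numbers, and the rule "adjacent words must share a class number" are all hypothetical and chosen only to make the control flow visible.

```python
from typing import Dict, List, Optional, Tuple

# (placeholder target word, hypothetical class number standing in for WordAsso)
Candidate = Tuple[str, int]

# Hypothetical SL-TL dictionary; every entry is invented for illustration.
SL_TL_DICT: Dict[str, List[Candidate]] = {
    "bank": [("TL_bank_river", 2), ("TL_bank_money", 1)],
    "deposit": [("TL_deposit", 1)],
}


def generate(words: List[str]) -> List[int]:
    """Generation step: pick the first listed candidate for every word."""
    return [0 for _ in words]


def evaluate(words: List[str], choice: List[int]) -> Optional[int]:
    """Evaluation step (toy rule, an assumption): return the position of the
    first word whose class number differs from its left neighbour's, or None
    if the candidate passes."""
    classes = [SL_TL_DICT[w][i][1] for w, i in zip(words, choice)]
    for pos in range(1, len(classes)):
        if classes[pos] != classes[pos - 1]:
            return pos
    return None


def repair(words: List[str], choice: List[int], pos: int) -> Optional[List[int]]:
    """Repair step: switch the flagged word to its next candidate, if any."""
    if choice[pos] + 1 >= len(SL_TL_DICT[words[pos]]):
        return None  # no alternative left; the sentence stays unrepaired
    fixed = list(choice)
    fixed[pos] += 1
    return fixed


def translate(words: List[str], max_rounds: int = 5) -> Optional[List[str]]:
    """Generate a candidate, then evaluate and repair until it passes."""
    choice: Optional[List[int]] = generate(words)
    for _ in range(max_rounds):
        pos = evaluate(words, choice)
        if pos is None:
            return [SL_TL_DICT[w][i][0] for w, i in zip(words, choice)]
        choice = repair(words, choice, pos)
        if choice is None:
            return None
    return None


print(translate(["deposit", "bank"]))
print(translate(["bank", "deposit"]))
```

In the first call the initial candidate fails the toy check, the flagged word is switched to its alternative, and the repaired candidate passes. In the second call the flagged word has no alternative, so the sketch gives up and returns `None`, loosely mirroring the abstract's point that some sentences could not be repaired with the current resources.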

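Bibliographic record

  • Author

    Naruedomkul, Kanlaya.

  • Author affiliation

    The University of Regina (Canada).

  • Degree-granting institution The University of Regina (Canada).
  • Discipline Computer Science.
  • Degree Ph.D.
  • Year 2000
  • Page p.6568
  • Total pages 276
  • Original format PDF
  • Language of text eng
  • Chinese Library Classification Automation technology, computer technology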