Tree-adjoining machine translation.

Abstract

Machine Translation (MT) is the task of translating a document from a source language (e.g., Chinese) into a target language (e.g., English) via computer. State-of-the-art statistical approaches to MT use large collections of human-translated documents as training material, gathering statistics on the patterns of correspondence between languages according to the features specified by the translation model. Using this bilingual translation model in conjunction with a target language model, created by gathering statistics from a large monolingual corpus, a new document in the source language can be automatically translated into its target-language equivalent with surprising accuracy.

Much MT research focuses on the types of patterns and features to include in a translation model. Recent statistical MT models have used syntax trees to enforce grammaticality, but the currently popular tree substitution models only memorize sequences of words or constituents, specifying exactly what phrases to use and exactly what trees are grammatical, which does not generalize well. Adding the operation of tree-adjoining provides the freedom to splice additional information into an existing grammatical tree. An adjoining translation model allows general, linguistically motivated translation patterns to be learned without the clutter of endless variations of optional material. The appropriate modifiers, such as adjectives, adverbs, and prepositional phrases, can be grafted into these core patterns as needed to translate details. We show that the increased generalization power provided by adjoining, when used carefully, improves MT quality without becoming computationally intractable.

In this thesis, we describe challenges encountered by both word-sequence-based and syntax-tree-based MT systems today, and present an in-depth, quantitative comparison of the two models. Then we describe a novel model for statistical MT which addresses these challenges using a synchronous tree-adjoining grammar. We introduce a method of converting these grammars to a weakly equivalent tree transducer for decoding. Then we present a method for learning the rules and associated probabilities of this grammar from aligned tree/string training data, and empirically analyze important characteristics of the resulting model, considering and evaluating many variations. Finally, our results show that adjoining delivers a consistent improvement over a baseline statistical syntax-based MT model on both medium and large-scale MT tasks using several language pairs.
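The adjoining operation mentioned in the abstract can be illustrated with a small monolingual sketch. This is not the thesis's implementation, and it does not cover the synchronous (two-language) case or rule probabilities. The names `Node`, `find_foot`, `adjoin`, `build_example`, and `AUX_BIG` are hypothetical and chosen only for this example. It shows how an auxiliary tree for an optional modifier ("big") is spliced into a core tree ("the dog barked") at a node with a matching label.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A syntax-tree node; a node without children is treated as a word."""
    label: str
    children: List["Node"] = field(default_factory=list)
    is_foot: bool = False  # True only for the foot node of an auxiliary tree

    def leaves(self) -> List[str]:
        """Return the words at the frontier of the tree, left to right."""
        if not self.children:
            return [self.label]
        words: List[str] = []
        for child in self.children:
            words.extend(child.leaves())
        return words


def find_foot(node: Node) -> Optional[Node]:
    """Depth-first search for the foot node of an auxiliary tree."""
    if node.is_foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(target: Node, aux_root: Node) -> None:
    """Adjoin a copy of the auxiliary tree at `target`, in place.

    The subtree below `target` moves under the foot node, and the
    auxiliary tree's material takes its place under `target`.
    """
    aux = copy.deepcopy(aux_root)  # keep the auxiliary tree reusable
    foot = find_foot(aux)
    if foot is None or foot is aux:
        raise ValueError("auxiliary tree needs a foot node below its root")
    if aux.label != target.label or foot.label != target.label:
        raise ValueError("root and foot labels must match the target label")
    foot.children = target.children
    foot.is_foot = False
    target.children = aux.children


# Auxiliary tree for an optional adjective: NN -> JJ(big) NN*
AUX_BIG = Node("NN", [Node("JJ", [Node("big")]), Node("NN", is_foot=True)])


def build_example() -> Tuple[Node, Node]:
    """Build the core tree for 'the dog barked'; return (root, NN node)."""
    nn = Node("NN", [Node("dog")])
    np = Node("NP", [Node("DT", [Node("the")]), nn])
    vp = Node("VP", [Node("VBD", [Node("barked")])])
    return Node("S", [np, vp]), nn


if __name__ == "__main__":
    root, nn = build_example()
    adjoin(nn, AUX_BIG)
    print(" ".join(root.leaves()))  # prints "the big dog barked"
```

In this toy setting the core pattern "the dog barked" is stored once and the adjective is grafted in only when needed, which mirrors the generalization argument made in the abstract. A tree substitution model would instead need a separate memorized pattern for each variant.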

Bibliographic record

Author: DeNeefe, Steve
Author affiliation: University of Southern California
Degree-granting institution: University of Southern California
Discipline: Computer Science
Degree: Ph.D.
Year: 2011
Pages: 171 p.
Original format: PDF
Language: English
