Conference on Empirical Methods in Natural Language Processing (EMNLP 2011)

Feature-Rich Language-Independent Syntax-Based Alignment for Statistical Machine Translation



Abstract

We present an accurate word alignment algorithm that heavily exploits source and target-language syntax. Using a discriminative framework and an efficient bottom-up search algorithm, we train a model of hundreds of thousands of syntactic features. Our new model (1) helps us to very accurately model syntactic transformations between languages; (2) is language-independent; and (3) with automatic feature extraction, assists system developers in obtaining good word-alignment performance off-the-shelf when tackling new language pairs. We analyze the impact of our features, describe inference under the model, and demonstrate significant alignment and translation quality improvements over already-powerful baselines trained on very large corpora. We observe translation quality improvements corresponding to 1.0 and 1.3 BLEU for Arabic-English and Chinese-English, respectively.
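To make the idea of a discriminative model over sparse features concrete, here is a minimal sketch of how a linear model might score candidate alignments. It is an illustration under assumptions, not the authors' implementation: the feature templates (`pair=...`, `distortion=...`), the function names, and the toy weights are all hypothetical, and the explicit candidate list stands in for the bottom-up search the abstract mentions, which is not reproduced here.

```python
from collections import Counter


def extract_features(src_words, tgt_words, links):
    """Count sparse indicator features for one candidate alignment.

    The templates below are hypothetical placeholders; the abstract does
    not list the syntactic features the model actually uses.
    """
    feats = Counter()
    for i, j in links:
        feats[f"pair={src_words[i]}|{tgt_words[j]}"] += 1
        feats[f"distortion={abs(i - j)}"] += 1
    return feats


def score(weights, feats):
    """Linear model score: dot product of weights and feature counts."""
    return sum(weights.get(name, 0.0) * count for name, count in feats.items())


def best_alignment(weights, src_words, tgt_words, candidates):
    """Return the highest-scoring alignment from an explicit candidate list.

    Enumerating candidates is an assumption made to keep the sketch short;
    it replaces the efficient search described in the abstract.
    """
    return max(
        candidates,
        key=lambda links: score(
            weights, extract_features(src_words, tgt_words, links)
        ),
    )


# Toy usage with made-up weights (in practice these would be learned).
src = ["la", "maison"]
tgt = ["the", "house"]
candidates = [[(0, 0), (1, 1)], [(0, 1), (1, 0)]]
weights = {"pair=la|the": 1.0, "pair=maison|house": 1.0, "distortion=1": -0.5}
print(best_alignment(weights, src, tgt, candidates))
```

Because the score is a sum over features, adding a new feature template only changes `extract_features`, which is one reason a model of this shape can grow to a very large number of features.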
