Conference: 9th International Conference on Language Resources and Evaluation

English-French Verb Phrase Alignment in Europarl for Tense Translation Modeling

Abstract

This paper presents a method for verb phrase (VP) alignment in an English/French parallel corpus and its use for improving statistical machine translation (SMT) of verb tenses. The method starts from automatic word alignment performed with GIZA++, and relies on a POS tagger and a parser, in combination with several heuristics, in order to identify non-contiguous components of VPs, and to label the aligned VPs with their tense and voice on each side. This procedure is applied to the Europarl corpus, leading to the creation of a smaller, high-precision parallel corpus with about 320 000 pairs of finite VPs, which is made publicly available. This resource is used to train a tense predictor for translation from English into French, based on a large number of surface features. Three MT systems are compared: (1) a baseline phrase-based SMT; (2) a tense-aware SMT system using the above predictions within a factored translation model; and (3) a system using oracle predictions from the aligned VPs. For several tenses, such as the French imparfait, the tense-aware SMT system improves significantly over the baseline and is closer to the oracle system.
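
The step from word alignment to VP alignment can be illustrated with a minimal sketch. This is not the paper's procedure: it assumes the word alignment is already available as "i-j" index pairs (source index, target index), and that a tagger and parser have already produced the token indices of each VP, possibly non-contiguous. The function names and the majority-vote heuristic are hypothetical and chosen only for illustration.

```python
from collections import Counter


def parse_alignment(line):
    """Parse 'i-j' pairs (English index, French index) into a list of tuples.

    The 'i-j' text format is an assumption made for this sketch.
    """
    return [tuple(int(k) for k in pair.split("-")) for pair in line.split()]


def align_verb_phrases(links, en_vps, fr_vps):
    """Pair each English VP with the French VP that shares the most links.

    en_vps / fr_vps: lists of sets of token indices; a set may be
    non-contiguous (e.g. an auxiliary and a participle split by an adverb).
    Returns (en_vp_index, fr_vp_index) pairs. English VPs with no link
    into any French VP are skipped.
    """
    # Map every French VP token to the index of the VP that contains it.
    fr_owner = {tok: k for k, vp in enumerate(fr_vps) for tok in vp}
    pairs = []
    for i, en_vp in enumerate(en_vps):
        votes = Counter(
            fr_owner[t] for s, t in links if s in en_vp and t in fr_owner
        )
        if votes:
            pairs.append((i, votes.most_common(1)[0][0]))
    return pairs


# Illustrative sentence pair (invented for this sketch):
#   EN: I(0) have(1) often(2) seen(3) it(4)
#   FR: Je(0) l'(1) ai(2) souvent(3) vu(4)
links = parse_alignment("0-0 1-2 2-3 3-4 4-1")
en_vps = [{1, 3}]  # "have ... seen"
fr_vps = [{2, 4}]  # "ai ... vu"
print(align_verb_phrases(links, en_vps, fr_vps))  # [(0, 0)]
```

Each aligned pair could then be labeled with tense and voice on both sides, which is the kind of data the abstract describes as training material for the tense predictor.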