Conference: Workshop on reordering for statistical machine translation

A Tagging-style Reordering Model for Phrase-based SMT

Abstract

For current statistical machine translation systems, reordering remains a major problem for language pairs such as Chinese-English, where the source and target languages differ significantly in word order. In this paper we propose a novel tagging-style reordering model. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. For a given source sentence, we assign each source token a label that contains the reordering information for that token. We also design an unaligned-word tag so that the unaligned-word phenomenon is automatically covered by the proposed model. Our reordering model is conditioned on the whole source sentence, so it is able to capture long-range dependencies in the source sentence. The decoder uses the tagging information as soft constraints, so the model is very efficient in the test phase (during translation). Training the model on large-scale tasks, however, requires a notable amount of computational resources. We carried out experiments on five Chinese-English NIST tasks trained with BOLT data. Results show that our model improves over the baseline system by 0.98 BLEU and 1.21 TER on average.
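
To make the idea of per-token reordering labels concrete, the following is a minimal sketch of how such labels could be derived from a word alignment. The abstract does not specify the tag inventory, so the tag names (`UNALIGN`, `BEGIN`, `END`, `Lmono`, `Lreorder`, `Rmono`, `Rreorder`), the function `reordering_tags`, and the input format (one target position or `None` per source token) are all assumptions made for illustration, not the authors' definitions.

```python
from typing import List, Optional

UNALIGNED = "UNALIGN"  # hypothetical name for the unaligned-word tag


def reordering_tags(alignment: List[Optional[int]]) -> List[str]:
    """Assign one illustrative reordering tag to each source token.

    alignment[i] is the target position aligned to source token i,
    or None if the token is unaligned (an assumed input format).
    """
    # Source positions that carry an alignment, in source order.
    aligned = [i for i, tgt in enumerate(alignment) if tgt is not None]
    rank = {src: k for k, src in enumerate(aligned)}

    tags = []
    for i, tgt in enumerate(alignment):
        if tgt is None:
            tags.append(UNALIGNED)
            continue
        k = rank[i]
        # Compare with the nearest aligned neighbour on each side.
        if k == 0:
            left = "BEGIN"
        else:
            left = "Lmono" if alignment[aligned[k - 1]] <= tgt else "Lreorder"
        if k == len(aligned) - 1:
            right = "END"
        else:
            right = "Rmono" if alignment[aligned[k + 1]] >= tgt else "Rreorder"
        tags.append(f"{left}-{right}")
    return tags


# Four source tokens: the second is unaligned, the last two are swapped
# on the target side.
print(reordering_tags([0, None, 3, 2]))
```

In this sketch, a sequence labeler trained on such tags would predict them from the source sentence alone, and a decoder could then reward or penalize hypotheses according to whether they agree with the predicted tags. How the tags are actually scored during decoding is not described in the abstract.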