Conference: Information Science and Technology (ICIST), 2012 International Conference on

Syntax encapsulated phrase model for statistical machine translation

Abstract

In the past few years, much attention has been paid to extending phrase-based statistical machine translation with syntactic structures. In this paper we introduce a novel syntax encapsulated phrase (SEP) model, in which treebank tag sequences are used to decorate the bilingual phrase pairs. We use tag sequences, instead of phrase pairs, to train the lexicalized reordering model. Since the number of treebank tags is much smaller than the number of words, the tag-sequence-based reordering model is smaller and more accurate than the phrase-based reordering model. Experiments were carried out on four types of models: the phrase model, the hierarchical phrase model, the POS tag encapsulated phrase (PTEP) model, and the syntactic tag encapsulated phrase (STEP) model. The STEP model obtained a higher BLEU-4 score than the other models on the NIST 2005 MT task.
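
The idea of keying the reordering model on tag sequences rather than on phrase pairs can be illustrated with a minimal sketch. This is not the authors' implementation: the toy events, the tag labels, the three-way monotone/swap/discontinuous orientation set, the add-alpha smoothing, and the names `EVENTS` and `train` are all assumptions made for illustration. It only shows why pooling counts by tag sequence yields a table with fewer entries than pooling by phrase pair.

```python
from collections import Counter, defaultdict

# Assumed orientation classes (monotone, swap, discontinuous).
ORIENTATIONS = ("mono", "swap", "disc")

# Hypothetical toy events: (phrase pair, tag sequence, observed orientation).
# Neither the phrases nor the tags come from the paper.
EVENTS = [
    (("hong pingguo", "red apple"), "JJ NN", "mono"),
    (("qing pingguo", "green apple"), "JJ NN", "mono"),
    (("da fangzi", "big house"), "JJ NN", "mono"),
    (("zai Beijing", "in Beijing"), "P NR", "swap"),
    (("zai Shanghai", "in Shanghai"), "P NR", "swap"),
    (("zai Beijing", "in Beijing"), "P NR", "mono"),
]


def train(events, key_index, alpha=0.5):
    """Estimate p(orientation | key) with add-alpha smoothed relative frequencies.

    key_index=0 conditions on the phrase pair, key_index=1 on the tag sequence.
    """
    counts = defaultdict(Counter)
    for event in events:
        counts[event[key_index]][event[2]] += 1
    model = {}
    for key, c in counts.items():
        total = sum(c.values()) + alpha * len(ORIENTATIONS)
        model[key] = {o: (c[o] + alpha) / total for o in ORIENTATIONS}
    return model


phrase_model = train(EVENTS, key_index=0)  # one entry per distinct phrase pair
tag_model = train(EVENTS, key_index=1)     # one entry per distinct tag sequence

print("phrase-keyed entries:", len(phrase_model))
print("tag-keyed entries:", len(tag_model))
print("p(. | 'JJ NN'):", tag_model["JJ NN"])
```

On this toy data the phrase-keyed table has five entries and the tag-keyed table has two, because phrase pairs that share a tag sequence pool their orientation counts. Whether that pooling also improves accuracy on real data is the paper's empirical claim and is not something this sketch demonstrates.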
