A Tagging-style Reordering Model for Phrase-based SMT

Abstract

For current statistical machine translation systems, reordering remains a major problem for language pairs such as Chinese-English, where the source and target languages differ significantly in word order. In this paper we propose a novel tagging-style reordering model. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. For a given source sentence, we assign each source token a label that contains the reordering information for that token. We also design an unaligned-word tag so that the unaligned-word phenomenon is automatically covered by the proposed model. Our reordering model is conditioned on the whole source sentence, so it is able to capture long-range dependencies in the source sentence. The decoder uses the tagging information as soft constraints, so the model is very efficient in the test phase (during translation). Training the model on large-scale tasks requires notable amounts of computational resources. We carried out experiments on five Chinese-English NIST tasks trained with BOLT data. Results show that our model improves the baseline system by 0.98 BLEU and 1.21 TER on average.
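
The idea of turning reordering into a tagging task can be illustrated with a minimal Python sketch. It is not the authors' method or tag set: the function name `reordering_tags`, the tag names (`UNALIGN`, `BEGIN`, `END`, `Lmono`, `Lreorder`, `Rmono`, `Rreorder`) and the one-to-one alignment format are all assumptions made for this illustration. Each aligned source token receives a label describing whether its aligned left and right neighbours keep the same relative order on the target side, and unaligned tokens receive a dedicated tag.

```python
from typing import Dict, List

UNALIGNED = "UNALIGN"  # hypothetical tag name for unaligned source words


def reordering_tags(num_source: int, alignment: Dict[int, int]) -> List[str]:
    """Return one illustrative reordering tag per source token.

    `alignment` maps a source index to a single target index. This is a
    simplifying assumption; real word alignments can be many-to-many.
    """
    # Source positions that have an alignment link, in source order.
    aligned = [j for j in range(num_source) if j in alignment]
    tags = [UNALIGNED] * num_source

    for k, j in enumerate(aligned):
        # Left context: is the previous aligned source token also earlier
        # on the target side (monotone) or later (reordered)?
        if k == 0:
            left = "BEGIN"
        elif alignment[aligned[k - 1]] < alignment[j]:
            left = "Lmono"
        else:
            left = "Lreorder"

        # Right context: same comparison with the next aligned source token.
        if k == len(aligned) - 1:
            right = "END"
        elif alignment[j] < alignment[aligned[k + 1]]:
            right = "Rmono"
        else:
            right = "Rreorder"

        tags[j] = f"{left}-{right}"
    return tags


if __name__ == "__main__":
    # Four source tokens; token 2 is unaligned, tokens 1 and 3 swap order.
    print(reordering_tags(4, {0: 0, 1: 3, 3: 1}))
```

Tag sequences of this kind, derived from aligned training data, could then serve as training labels for a sequence labeler such as a conditional random field (listed among the keywords), whose predictions for a new source sentence are available to the decoder before translation starts.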

Bibliographic details

Source
Workshop on reordering for statistical machine translation | 2012 | pp. 17-26 | 10 pages
Conference location: Mumbai (IN)
Authors
Minwei FENG; Hermann NEY;
Author affiliations

Human Language Technology and Pattern Recognition Group, Computer Science Department, RWTH Aachen University, Aachen, Germany;

Human Language Technology and Pattern Recognition Group, Computer Science Department, RWTH Aachen University, Aachen, Germany;

Original format: PDF
Text language: eng
Keywords
statistical machine translation; reordering; conditional random fields;
