Improved Arabic-to-English statistical machine translation by reordering post-verbal subjects for word alignment

Marine Carpuat; Yuval Marton; Nizar Habash

Abstract

We study challenges raised by the order of Arabic verbs and their subjects in statistical machine translation (SMT). We show that the boundaries of post-verbal subjects (VS) are hard to detect accurately, even with a state-of-the-art Arabic dependency parser. In addition, VS constructions have highly ambiguous reordering patterns when translated to English, and these patterns are very different for matrix (main clause) VS and non-matrix (subordinate clause) VS. Based on this analysis, we propose a novel method for leveraging VS information in SMT: we reorder VS constructions into pre-verbal (SV) order for word alignment. Unlike previous approaches to source-side reordering, phrase extraction and decoding are performed using the original Arabic word order. This strategy significantly improves BLEU and TER scores, even on a strong large-scale baseline. Limiting reordering to matrix VS yields further improvements.
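The core idea in the abstract (reorder VS into SV only for word alignment, then return to the original Arabic order for phrase extraction and decoding) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `reorder_vs_to_sv` and `remap_alignment`, the placeholder tokens, and the assumption that the verb index and subject span are already known (for example, from a dependency parse) are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def reorder_vs_to_sv(
    tokens: List[str], verb_idx: int, subj_start: int, subj_end: int
) -> Tuple[List[str], List[int]]:
    """Move a post-verbal subject span [subj_start, subj_end) in front of its verb.

    Returns the reordered tokens and a permutation `perm` such that
    perm[new_position] == original_position. The span boundaries are
    assumed to be given; detecting them is the hard part in practice.
    """
    if not (0 <= verb_idx < subj_start < subj_end <= len(tokens)):
        raise ValueError("expected verb before a non-empty subject span")
    perm = (
        list(range(0, verb_idx))              # material before the verb
        + list(range(subj_start, subj_end))   # subject moved up front
        + list(range(verb_idx, subj_start))   # verb (and anything up to the subject)
        + list(range(subj_end, len(tokens)))  # rest of the sentence
    )
    return [tokens[i] for i in perm], perm


def remap_alignment(
    links: List[Tuple[int, int]], perm: List[int]
) -> List[Tuple[int, int]]:
    """Map alignment links computed on the reordered source back to original positions.

    Each link is (source_position_in_reordered_sentence, target_position).
    """
    return sorted((perm[src], tgt) for src, tgt in links)


if __name__ == "__main__":
    # Placeholder tokens: V = verb, S1 S2 = two-word subject, O = object.
    source = ["V", "S1", "S2", "O"]
    reordered, perm = reorder_vs_to_sv(source, verb_idx=0, subj_start=1, subj_end=3)
    print(reordered)  # ['S1', 'S2', 'V', 'O']

    # Pretend an aligner produced monotone links on the reordered sentence.
    links_on_reordered = [(0, 0), (1, 1), (2, 2), (3, 3)]
    print(remap_alignment(links_on_reordered, perm))
    # [(0, 2), (1, 0), (2, 1), (3, 3)]
```

In this sketch the aligner sees the SV-ordered sentence, while the remapped links refer to positions in the original VS-ordered sentence, so later steps can work on the unmodified Arabic word order. Restricting reordering to matrix VS would amount to calling `reorder_vs_to_sv` only for spans identified as main-clause subjects.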

Bibliographic details

Source
Machine Translation, 2012, issue 2, pp. 105-120 (16 pages)
Authors
Marine Carpuat; Yuval Marton; Nizar Habash
Affiliations

National Research Council, 283 Alexandre-Tache Boulevard, Building CRTL, Gatineau, QC J8X 3X7, Canada;

IBM T. J. Watson Research Center, Kitchawan Road/Route 134, Yorktown Heights, NY 10598, USA;

Columbia University Center for Computational Learning Systems, 475 Riverside Drive MC 7717, New York, NY 10115, USA

Indexed in: Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
statistical machine translation; reordering; VS; post-verbal subjects; matrix subject; subject detection; word alignment; dependency parsing;
