Home > Foreign Conference Proceedings > Annual Meeting of the Association for Computational Linguistics; ACL 2012 > Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the ℓ_0-norm

Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the ℓ_0-norm



Abstract

Two decades after their invention, the IBM word-based translation models, widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems. Although many models have surpassed them in accuracy, none have supplanted them in practice. In this paper, we propose a simple extension to the IBM models: an ℓ_0 prior to encourage sparsity in the word-to-word translation model. We explain how to implement this extension efficiently for large-scale data (also released as a modification to GIZA++) and demonstrate, in experiments on Czech, Arabic, Chinese, and Urdu to English translation, significant improvements over IBM Model 4 in both word alignment (up to +6.7 F1) and translation quality (up to +1.4 BLEU).
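To make the sparsity idea concrete: the ℓ_0 "norm" of a translation-probability vector simply counts its nonzero entries, so penalizing it favors word-to-word tables in which each source word translates to only a few targets. Because ℓ_0 is non-differentiable, optimization typically goes through a smooth surrogate. The sketch below is illustrative only, not the paper's algorithm: it contrasts the exact nonzero count with one common differentiable surrogate, Σ_i |p_i|^α, which approaches the ℓ_0 count as α → 0 (the function names and the choice of surrogate are our own for illustration).

```python
import numpy as np

def l0_norm(p, tol=1e-12):
    # Exact l0 "norm": the number of entries with magnitude above tol.
    return int(np.sum(np.abs(p) > tol))

def smoothed_l0(p, alpha=0.05):
    # Differentiable surrogate: sum_i |p_i|^alpha.
    # As alpha -> 0, each nonzero term -> 1, so the sum -> l0_norm(p).
    return float(np.sum(np.abs(p) ** alpha))

# A sparse word-to-word translation distribution over four target words:
# only two of the four candidates receive probability mass.
p = np.array([0.7, 0.3, 0.0, 0.0])
print(l0_norm(p))              # 2: two nonzero translation options
print(smoothed_l0(p, 0.01))    # approximately 2 for small alpha
```

Subtracting a penalty of this form from the EM objective pushes low-count translation candidates all the way to zero rather than leaving them with tiny residual probability, which is the behavior the abstract's "smaller alignment models" refers to.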

