Statistical Significance Tests for Machine Translation Evaluation

Abstract

If two translation systems differ in performance on a test set, can we trust that this indicates a difference in true system quality? To answer this question, we describe bootstrap resampling methods to compute statistical significance of test results, and validate them on the concrete example of the BLEU score. Even for small test sizes of only 300 sentences, our methods may give us assurances that test result differences are real.
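
The abstract names bootstrap resampling as the way to judge whether a score difference between two systems is real. As an illustration only, here is a minimal sketch of one common variant, paired bootstrap resampling. It is not the paper's own code. The function names, the per-sentence statistics format, and the toy pooled-precision metric (a stand-in for BLEU) are all assumptions made for this example.

```python
import random
from typing import Callable, Sequence, Tuple

# Hypothetical per-sentence statistics: (matched words, total words).
Stats = Tuple[int, int]


def corpus_precision(stats: Sequence[Stats]) -> float:
    """Toy corpus-level metric standing in for BLEU: pooled word precision."""
    matched = sum(m for m, _ in stats)
    total = sum(t for _, t in stats)
    return matched / total if total else 0.0


def paired_bootstrap(
    stats_a: Sequence[Stats],
    stats_b: Sequence[Stats],
    metric: Callable[[Sequence[Stats]], float] = corpus_precision,
    n_samples: int = 1000,
    seed: int = 0,
) -> float:
    """Return the fraction of resampled test sets on which system A beats B."""
    if len(stats_a) != len(stats_b):
        raise ValueError("both systems must be scored on the same sentences")
    if not stats_a:
        raise ValueError("test set must not be empty")

    rng = random.Random(seed)
    n = len(stats_a)
    wins = 0
    for _ in range(n_samples):
        # Draw n sentence indices with replacement; use the same indices
        # for both systems so the comparison stays paired.
        idx = [rng.randrange(n) for _ in range(n)]
        score_a = metric([stats_a[i] for i in idx])
        score_b = metric([stats_b[i] for i in idx])
        if score_a > score_b:
            wins += 1
    return wins / n_samples
```

Calling `paired_bootstrap(stats_a, stats_b)` on two systems scored over the same 300 sentences returns a value between 0 and 1. A value at or above a chosen threshold such as 0.95 is conventionally read as evidence that system A is better; the threshold is a choice of the analyst, not something fixed by the method.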

Bibliographic details

Source: 42nd Annual Meeting of the Association for Computational Linguistics | 2004 | pp. 1-8 | 8 pages
Author: Philipp Koehn
Original format: PDF
CLC classification: Computer software

Similar documents

1. Significance tests of automatic machine translation evaluation metrics [J]. Ying Zhang, Stephan Vogel. Machine Translation. 2010, No. 1
2. Function words in statistical machine-translated Chinese and original Chinese: A study into the translationese of machine translation systems [J]. Kuo Chen-li. Digital Scholarship in the Humanities. 2019, No. 4
3. MTIL2017: Machine Translation Using Recurrent Neural Network on Statistical Machine Translation [J]. Sainik Kumar Mahata, Dipankar Das, Sivaji Bandyopadhyay. Journal of Intelligent Systems. 2019, No. 3
4. Large-scale Dictionary Construction via Pivot-based Statistical Machine Translation with Significance Pruning and Neural Network Features [C]. Raj Dabre, Chenhui Chu, Fabien Cromieres, Pacific Asia Conference on Language, Information and Computation. 2015
5. Modeling Relevance in Statistical Machine Translation: Scoring Alignment, Context, and Annotations of Translation Instances [D]. Phillips, Aaron B. 2012
6. 3145 An Evaluation of Machine Learning and Traditional Statistical Methods for Discovery in Large-Scale Translational Data [O]. Megan C Hollister, Jeffrey D. Blume. 2019
7. Randomized Significance Tests in Machine Translation [O]. Yvette Graham, Nitika Mathur, Timothy Baldwin. 2015
