Machine Translation (journal)

Significance tests of automatic machine translation evaluation metrics


Abstract

Automatic evaluation metrics for Machine Translation (MT) systems, such as BLEU, METEOR and the related NIST metric, are becoming increasingly important in MT research and development. This paper presents a significance test-driven comparison of n-gram-based automatic MT evaluation metrics. Statistical significance tests use bootstrapping methods to estimate the reliability of automatic machine translation evaluations. Based on this reliability estimation, we study the characteristics of different MT evaluation metrics and how to construct reliable and efficient evaluation suites.
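The paper itself does not include code, but the bootstrapping idea it describes can be made concrete with a short sketch. Below is a minimal illustration of paired bootstrap resampling for deciding whether one MT system reliably outscores another under a corpus-level metric. All names (`paired_bootstrap`, `metric`, the toy `unigram_precision` stand-in for BLEU/NIST/METEOR) are illustrative assumptions, not the authors' implementation.

```python
import random

def paired_bootstrap(metric, sys_a, sys_b, refs, n_samples=1000, seed=0):
    """Paired bootstrap resampling: draw test sets of the same size
    with replacement and count how often system A outscores system B
    on the corpus-level metric."""
    assert len(sys_a) == len(sys_b) == len(refs)
    rng = random.Random(seed)
    n = len(refs)
    wins_a = 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample sentence indices
        a = metric([sys_a[i] for i in idx], [refs[i] for i in idx])
        b = metric([sys_b[i] for i in idx], [refs[i] for i in idx])
        if a > b:
            wins_a += 1
    # Fraction of resamples where A beats B; values near 1.0 (or 0.0)
    # suggest the observed difference is reliable rather than noise.
    return wins_a / n_samples

def unigram_precision(hyps, refs):
    """Toy corpus-level metric standing in for BLEU/NIST/METEOR."""
    match = total = 0
    for hyp, ref in zip(hyps, refs):
        ref_tokens = set(ref.split())
        hyp_tokens = hyp.split()
        match += sum(1 for t in hyp_tokens if t in ref_tokens)
        total += len(hyp_tokens)
    return match / max(total, 1)

if __name__ == "__main__":
    refs  = ["the cat sat on the mat", "he reads a book", "it is raining"]
    sys_a = ["the cat sat on a mat", "he reads the book", "it is raining"]
    sys_b = ["a cat on mat", "he read book", "rain falls"]
    print(paired_bootstrap(unigram_precision, sys_a, sys_b, refs))
```

The same resampled corpus-level scores can also be sorted to read off a confidence interval for a single system's score, which is the kind of reliability estimate the abstract refers to.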
