MEANT: An inexpensive, high-accuracy, semi-automatic metric for evaluating translation utility via semantic frames

Abstract

We introduce a novel semi-automated metric, MEANT, that assesses translation utility by matching semantic role fillers, producing scores that correlate with human judgment as well as HTER but at much lower labor cost. As machine translation systems improve in lexical choice and fluency, the shortcomings of widespread n-gram based, fluency-oriented MT evaluation metrics such as BLEU, which fail to properly evaluate adequacy, become more apparent. But more accurate, non-automatic adequacy-oriented MT evaluation metrics like HTER are highly labor-intensive, which bottlenecks the evaluation cycle. We first show that when using untrained monolingual readers to annotate semantic roles in MT output, the non-automatic version of the metric HMEANT achieves a 0.43 correlation coefficient with human adequacy judgments at the sentence level, far superior to BLEU at only 0.20, and equal to the far more expensive HTER. We then replace the human semantic role annotators with automatic shallow semantic parsing to further automate the evaluation metric, and show that even the semi-automated evaluation metric achieves a 0.34 correlation coefficient with human adequacy judgment, which is still about 80% as closely correlated as HTER despite an even lower labor cost for the evaluation procedure. The results show that our proposed metric is significantly better correlated with human judgment on adequacy than current widespread automatic evaluation metrics, while being much more cost effective than HTER.
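The abstract says MEANT scores a translation by matching semantic role fillers, but it does not give the scoring formula. The sketch below is therefore only an illustration of the general idea, not the authors' method. It assumes that each sentence has already been annotated as a list of frames, that fillers match only on exact (case-insensitive) string equality, and that all roles carry equal weight. The `Frame` type, the role names, and the `role_filler_f1` helper are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: a frame maps a role label (e.g. "agent") to its filler text.
Frame = Dict[str, str]


def role_filler_f1(mt_frames: List[Frame],
                   ref_frames: List[Frame]) -> Tuple[float, float, float]:
    """Toy precision, recall and F1 over (frame index, role, filler) triples.

    Exact-match and unweighted: an illustrative simplification, not the
    scoring scheme of the paper.
    """
    def triples(frames: List[Frame]) -> set:
        return {
            (i, role, filler.lower().strip())
            for i, frame in enumerate(frames)
            for role, filler in frame.items()
        }

    mt, ref = triples(mt_frames), triples(ref_frames)
    matched = len(mt & ref)
    precision = matched / len(mt) if mt else 0.0
    recall = matched / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical annotations for one reference sentence and one MT output.
reference = [{
    "predicate": "signed",
    "agent": "the two countries",
    "patient": "a trade agreement",
    "temporal": "on Monday",
}]
mt_output = [{
    "predicate": "signed",
    "agent": "the two countries",
    "patient": "a business contract",
}]

precision, recall, f1 = role_filler_f1(mt_output, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the MT output gets the predicate and agent right, mistranslates the patient, and drops the temporal role, so two of its three fillers match two of the four reference fillers. A purely n-gram based score would not distinguish which of these roles was lost.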

Bibliographic details

Source: Annual Meeting of the Association for Computational Linguistics, 2011, 10 pages
Authors: Chi-kiu Lo; Dekai Wu
Original format: PDF
Chinese Library Classification: programming, software engineering
