International journal of computational linguistics and applications

BLEU Deconstructed: Designing a Better MT Evaluation Metric


Abstract

BLEU is the de facto standard automatic evaluation metric in machine translation. While BLEU is undeniably useful, it has a number of limitations. Although it works well for large documents and multiple references, it is unreliable at the sentence or sub-sentence levels, and with a single reference. In this paper, we propose new variants of BLEU which address these limitations, resulting in a more flexible metric which is not only more reliable, but also allows for more accurate discriminative training. Our best metric has better correlation with human judgements than standard BLEU, despite using a simpler formulation. Moreover, these improvements carry over to a system tuned for our new metric.
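To make the sentence-level brittleness mentioned in the abstract concrete, below is a minimal sketch of standard single-reference, sentence-level BLEU (modified n-gram precision with clipping, geometric mean, brevity penalty). It is not the paper's proposed variant; the tiny `smooth` constant is only there so the logarithm is defined, and the example sentences are illustrative.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of length n from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(hypothesis, reference, max_n=4, smooth=1e-9):
    """Single-reference, sentence-level BLEU in its standard formulation.

    With one reference and a short sentence, a single missing
    higher-order n-gram drives the geometric mean toward zero --
    the unreliability the abstract refers to. `smooth` is an
    arbitrary floor so log() is defined, not the paper's fix.
    """
    hyp, ref = hypothesis.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped counts: each hypothesis n-gram is credited at most
        # as many times as it occurs in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        log_precisions.append(math.log(max(clipped / total, smooth)))
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(log_precisions) / max_n)

if __name__ == "__main__":
    ref = "the cat sat on the mat"
    print(sentence_bleu("the cat sat on the mat", ref))  # 1.0: exact match
    print(sentence_bleu("the cat is on the mat", ref))   # ~0: no matching 4-gram
```

The second example shows the failure mode: one substituted word removes every 4-gram match, so the score collapses even though the hypothesis is mostly correct; with a large document or multiple references this effect averages out, which is why document-level BLEU remains usable.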
