Summarization Evaluation meets Short-Answer Grading

Abstract

Summarization Evaluation and Short-Answer Grading (SAG) share the challenge of automatically evaluating content quality. Therefore, we explore the use of ROUGE, a well-known Summarization Evaluation method, for Short-Answer Grading. We find a reliable ROUGE parametrization that is robust across corpora and languages and produces scores that are significantly correlated with human short-answer grades. In a by-corpus evaluation, ROUGE adds no information to NLP-based machine learning features for Short-Answer Grading. However, on a question-by-question basis, we find that the ROUGE Recall score may outperform standard NLP features. We therefore suggest using ROUGE within a framework for per-question feature selection or as a reliable and reproducible baseline for SAG.
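As a rough illustration of the metric the abstract builds on, the sketch below computes ROUGE-N recall between a reference (teacher) answer and a student answer. The paper's actual parametrization (n-gram order, stemming, stopword handling) is not reproduced here; the unigram setting, the whitespace tokenization, and the example answers are assumptions for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of n-grams from a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference, candidate, n=1):
    """ROUGE-N recall: the fraction of the reference's n-grams that also
    appear in the candidate, with overlap counts clipped to the candidate."""
    ref_ngrams = ngrams(reference.lower().split(), n)
    cand_ngrams = ngrams(candidate.lower().split(), n)
    if not ref_ngrams:
        return 0.0
    overlap = sum(min(count, cand_ngrams[gram]) for gram, count in ref_ngrams.items())
    return overlap / sum(ref_ngrams.values())


# Hypothetical short-answer grading usage: score a student answer
# against the teacher's reference answer.
reference_answer = "The function returns the largest element of the list."
student_answer = "It returns the biggest element in the list."
print(rouge_n_recall(reference_answer, student_answer, n=1))
```

In a SAG setting, a recall-oriented score of this kind rewards a student answer for covering the content of the reference answer, which is why the abstract highlights ROUGE Recall rather than precision.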
