Automated Summary Scoring with ReaderBench

Abstract

Text summarization is an effective reading comprehension strategy. However, summary evaluation is complex and must account for multiple factors, including both the summary and the reference text. This study examines a corpus of approximately 3,000 summaries of 87 reference texts, each manually scored on a 4-point Likert scale. Machine learning models leveraging Natural Language Processing (NLP) techniques were trained to predict the extent to which a summary captures the main idea of the target text. The NLP models combine domain- and language-independent textual complexity indices from the ReaderBench framework with state-of-the-art language models and deep learning architectures that provide semantic contextualization. The models achieve low errors (normalized MAE ranging from 0.13 to 0.17), with corresponding R² values of up to 0.46. Our approach consistently outperforms baselines that use TF-IDF vectors and linear models, as well as Transformer-based regression using BERT. These results indicate that NLP algorithms combining linguistic and semantic indices are accurate and robust, while generalizing to a wide array of topics.
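The two evaluation metrics reported above can be stated concretely: normalized MAE divides the mean absolute error by the range of the scoring scale (here, a 4-point Likert scale), and R² measures the fraction of score variance explained by the predictions. The sketch below, in plain Python, is illustrative only; the function names and example scores are ours, not the study's data.

```python
def normalized_mae(y_true, y_pred, lo=1.0, hi=4.0):
    """Mean absolute error divided by the score range (hi - lo)."""
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / (hi - lo)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical human scores (1-4 Likert) and model predictions:
human = [1, 2, 3, 4, 2, 3]
model = [1.5, 2.0, 2.5, 3.5, 2.5, 3.0]
print(normalized_mae(human, model))  # MAE of 1/3 over a 3-point range -> ~0.111
print(r_squared(human, model))       # -> ~0.818
```

A normalized MAE of 0.13-0.17 on a 4-point scale thus corresponds to an average prediction error of roughly 0.4-0.5 Likert points.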
机译:文本摘要是一种有效的阅读理解策略。然而,总结评估是复杂的,必须考虑各种因素,包括总结和参考文本。本研究以87篇参考文献为基础,对约3000篇摘要进行了语料库分析,每一篇摘要都在利克特4分量表上手工打分。机器学习模型利用自然语言处理(NLP)技术进行训练,以预测摘要在多大程度上捕获了目标文本的主要思想。NLP模型结合了ReaderBench框架中与领域和语言无关的文本复杂性指数,以及最先进的语言模型和深度学习架构,以提供语义语境化。该模型实现了低误差——归一化MAE范围为0.13-0.17,相应的R~2值高达0.46。我们的方法始终优于使用TF-IDF向量和线性模型的基线,以及使用BERT的基于变换器的回归。这些结果表明,结合了语言和语义索引的NLP算法是准确和健壮的,同时确保了广泛主题的可推广性。
