International conference on knowledge science, engineering and management

A Study on Performance Sensitivity to Data Sparsity for Automated Essay Scoring

Abstract

Automated essay scoring (AES) attempts to rate essays automatically using machine learning and natural language processing techniques, with the aim of substantially reducing the manual effort involved. Given a target prompt and a set of essays written for that prompt, established AES algorithms are mostly prompt-dependent: they rely heavily on labeled essays for the particular target prompt as training data, which makes the availability and completeness of those labeled essays essential to model performance. With this in mind, this paper investigates the impact of data sparsity on the effectiveness of several state-of-the-art AES models. Specifically, on the publicly available ASAP dataset, the effectiveness of different AES algorithms is compared across different levels of data completeness, which are simulated by random sampling. We show that the classical RankSVM and KNN models are more robust to data sparsity than the end-to-end deep neural network models, but that the latter achieve better performance once trained on sufficient data.
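
The abstract says that levels of data completeness are simulated with random sampling but gives no further detail. As a rough illustration only, the sketch below draws seeded random subsets of a labeled training set at several fractions. The function names, the fractions, and the toy (text, score) pairs are all assumptions made for this example and are not taken from the paper.

```python
import random


def subsample_training_set(essays, fraction, seed=0):
    """Return a random subset holding roughly `fraction` of the labeled essays."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded so each sparsity level is reproducible
    k = min(len(essays), max(1, round(len(essays) * fraction)))
    return rng.sample(essays, k)  # sampling without replacement


def completeness_levels(essays, fractions=(0.1, 0.25, 0.5, 1.0), seed=0):
    """Map each (assumed) completeness level to its sampled training subset."""
    return {f: subsample_training_set(essays, f, seed) for f in fractions}


# Toy stand-in for labeled essays: hypothetical (essay_text, score) pairs.
toy_essays = [(f"essay {i}", i % 5) for i in range(200)]
subsets = completeness_levels(toy_essays)
for level, subset in subsets.items():
    print(f"{level:.0%} of data: {len(subset)} training essays")
```

In a comparison like the one described, each model would then be trained on every subset and evaluated on the same held-out essays, so that differences in score reflect only the amount of labeled data.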
