Conference: International Conference on Knowledge Science, Engineering and Management

A Study on Performance Sensitivity to Data Sparsity for Automated Essay Scoring

Abstract

Automated essay scoring (AES) attempts to rate essays automatically using machine learning and natural language processing techniques, with the aim of dramatically reducing the manual effort involved. Given a target prompt and a set of essays to rate for that prompt, established AES algorithms are mostly prompt-dependent: they rely heavily on labeled essays for the particular target prompt as training data, so the availability and completeness of those labeled essays are essential for an AES model to perform well. With this in mind, this paper investigates the impact of data sparsity on the effectiveness of several state-of-the-art AES models. Specifically, on the publicly available ASAP dataset, the effectiveness of different AES algorithms is compared across different levels of data completeness, which are simulated with random sampling. We show that the classical RankSVM and KNN models are more robust to data sparsity than the end-to-end deep neural network models, but the latter achieve better performance once trained on sufficient data.
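As an illustration of the idea of simulating data completeness by random sampling, the following is a minimal sketch and not the paper's actual protocol. The function name `simulate_sparsity`, the toy essay data, and the completeness levels are all hypothetical; the abstract does not state which levels or sampling details were used.

```python
import random


def simulate_sparsity(labeled_essays, fraction, seed=0):
    """Return a random subset holding `fraction` of the labeled essays.

    Hypothetical helper: keeps at least one essay and samples without
    replacement, using a seeded generator so runs are repeatable.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    k = max(1, round(len(labeled_essays) * fraction))
    return rng.sample(labeled_essays, k)


# Toy stand-in for one prompt's labeled essays: (essay_text, score) pairs.
essays = [(f"essay {i}", i % 4) for i in range(100)]

# Assumed completeness levels, chosen only for illustration.
for fraction in (0.1, 0.25, 0.5, 1.0):
    train = simulate_sparsity(essays, fraction, seed=42)
    print(f"{fraction:.0%} completeness -> {len(train)} labeled essays")
```

In a comparison of this kind, each sampled subset would be used to train every model under study, with all models evaluated on the same held-out essays so that differences reflect only the amount of labeled data.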
