Journal: Kinetik

Improving Automatic Essay Scoring for Indonesian Language using Simpler Model and Richer Feature

Abstract

Automatic essay scoring is a machine learning task in which a model is trained to assess students' essay answers automatically. It is most useful when assessment is carried out at a large scale, where manual grading by humans becomes problematic. In 2019, the Ukara dataset was released for automatic essay scoring in the Indonesian language. The best previously published model on this dataset achieves an F1-score of 0.821 using pre-trained fastText sentence embeddings and a stacking model that combines a neural network with XGBoost. In this study, we propose a simpler classifier, a neural network with a single hidden layer, paired with a richer feature, namely BERT sentence embeddings. The pre-trained BERT sentence embedding model extracts more information from sentences yet has a smaller file size than the pre-trained fastText model. Our best model achieves an F1-score of 0.829 on the Ukara dataset, higher than the previous models.
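To make the setup in the abstract concrete, the following is a minimal sketch of a single-hidden-layer classifier applied to a sentence embedding, together with the F1-score used for evaluation. It is not the authors' implementation. The binary label scheme (1 = correct answer, 0 = incorrect), the ReLU and sigmoid activations, the layer sizes, the random untrained weights, and all function names are assumptions made for illustration; in practice the embedding would come from a pre-trained BERT sentence encoder and the weights would be learned.

```python
import math
import random


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def init_network(input_dim, hidden_dim, seed=0):
    """Random (untrained) weights for one hidden layer plus one output unit."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(input_dim)]
          for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
    b2 = 0.0
    return w1, b1, w2, b2


def predict_proba(embedding, params):
    """Forward pass: embedding -> ReLU hidden layer -> sigmoid probability."""
    w1, b1, w2, b2 = params
    hidden = [max(0.0, sum(w * x for w, x in zip(row, embedding)) + b)
              for row, b in zip(w1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


# Placeholder 8-dimensional vector standing in for a BERT sentence embedding
# (assumption: real embeddings are much larger and come from a pre-trained model).
params = init_network(input_dim=8, hidden_dim=4)
fake_embedding = [0.1 * i for i in range(8)]
probability = predict_proba(fake_embedding, params)
predicted_label = 1 if probability >= 0.5 else 0
```

As a worked check of the metric, `f1_score([1, 1, 0, 0], [1, 0, 1, 0])` has one true positive, one false positive, and one false negative, so precision and recall are both 0.5 and the F1-score is 0.5.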
