
Ensemble Learning on Scoring Student Essay


Abstract

Automated essay scoring is attracting growing attention from researchers. In this work, we develop a new way to extract textual features, which is shown to be valid. First, we compute distributed representations of words from a Wiki corpus using word2vec. Then, using K-means and the distributed representations, we calculate the number of words, the number of dictionary entries, and the diversity of words as the textual features. This yields 3*k textual features, where k is the number of categories. In addition, we calculate structural features, including the length of the essay, the number of paragraphs, the length of sentences, etc. We train several models, such as XGBoost, Random Forest, and GBDT, on the training set and use them to predict the test set. Finally, we ensemble the predictions of those models as the final prediction.
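The feature pipeline described in the abstract might be sketched as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the word-to-cluster mapping `word_to_cluster` is taken as given (in the paper it would come from running K-means on word2vec vectors), "number of dictionary" is read as the count of distinct words per cluster, "diversity" is read as distinct words divided by total words in that cluster, and the ensemble is a plain average, since the abstract does not say how predictions are combined. All function names are hypothetical.

```python
import re
from statistics import mean


def cluster_features(tokens, word_to_cluster, k):
    """Return 3*k features: for each cluster, the token count,
    the distinct-word count, and their ratio (assumed 'diversity')."""
    counts = [0] * k
    vocab = [set() for _ in range(k)]
    for tok in tokens:
        c = word_to_cluster.get(tok)
        if c is None:  # word has no embedding, so no cluster
            continue
        counts[c] += 1
        vocab[c].add(tok)
    feats = []
    for c in range(k):
        distinct = len(vocab[c])
        diversity = distinct / counts[c] if counts[c] else 0.0
        feats.extend([counts[c], distinct, diversity])
    return feats


def structure_features(essay):
    """Essay length in words, paragraph count, mean sentence length."""
    words = essay.split()
    paragraphs = [p for p in essay.split("\n") if p.strip()]
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    mean_len = len(words) / len(sentences) if sentences else 0.0
    return [len(words), len(paragraphs), mean_len]


def ensemble_mean(predictions):
    """Average the per-essay scores of several models (assumed scheme)."""
    return [mean(scores) for scores in zip(*predictions)]


# Toy example with k = 2 hypothetical clusters.
word_to_cluster = {"cat": 0, "dog": 0, "run": 1, "jump": 1}
essay = "cat dog cat run.\ndog jump run run!"
tokens = re.findall(r"[a-z]+", essay.lower())
features = cluster_features(tokens, word_to_cluster, 2) + structure_features(essay)
```

The resulting `features` vector has 3*k + 3 entries and could be passed to any regressor; `ensemble_mean` then combines the per-model score lists into one prediction per essay.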
