Conference: International Conference on New Trends in Intelligent Software Methodologies, Tools and Techniques

Improve Student Performance Prediction Using Ensemble Model for Higher Education



Abstract

In higher education institutions, the most significant issue is improving students' performance and retention rate. Large volumes of student data are mined to uncover hidden knowledge about students' learning behaviour, particularly to detect the early symptoms of at-risk students using Educational Data Mining techniques. However, data containing noise, outliers and irrelevant information can lead to inaccurate results. This study aims to develop a robust student performance prediction model for higher education institutions by identifying the features of student data that have the potential to improve prediction results, and by comparing ensemble learning techniques to identify the most suitable one after preprocessing the data and optimizing the hyperparameters. Data are collected from two systems, the student information system and the e-learning system, for undergraduate students of the Faculty of Engineering at a Malaysian public university. A total of 4413 student instances are used in this study. The process follows six data mining phases: data collection, data integration, data preprocessing (such as cleaning, normalization, and transformation), feature selection, pattern extraction, and finally model optimization and evaluation. The machine learning techniques used to build the prediction model are Decision Tree, Support Vector Machine and Artificial Neural Network; the ensemble learning techniques are Random Forest, Bagging, Stacking, Majority Vote and two Boosting variants, AdaBoost and XGBoost. Hyperparameters of the ensemble learning techniques are optimized to obtain better performance. The results show that combining features of students' behaviour from the e-learning system and the student information system, using Majority Vote, produced better results than the other ensemble methods.
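To make the Majority Vote idea concrete, here is a minimal sketch of hard voting over the outputs of three base models. It is an illustration only, not the study's implementation: the function name `majority_vote`, the labels "pass" and "at-risk", the example predictions, and the tie-breaking rule are all assumptions introduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per instance (hard voting).

    predictions: list of equal-length label sequences, one per base model.
    Ties go to the label voted by the earliest-listed model (an assumed rule).
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all models must predict the same number of instances")

    combined = []
    for votes in zip(*predictions):  # votes for one student, in model order
        counts = Counter(votes)
        top = max(counts.values())
        # pick the first label, in model order, that reaches the top count
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


# Hypothetical outputs of three base models for five students
tree_pred = ["pass", "at-risk", "pass", "at-risk", "pass"]
svm_pred = ["pass", "pass", "pass", "at-risk", "at-risk"]
ann_pred = ["at-risk", "at-risk", "pass", "pass", "at-risk"]

ensemble_pred = majority_vote([tree_pred, svm_pred, ann_pred])
print(ensemble_pred)
```

With an odd number of base models and two classes, ties cannot occur, so the tie-breaking rule only matters for even-sized ensembles or multi-class labels.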
