Knowledge Discovery for Higher Education Student Retention Based on Data Mining: Machine Learning Algorithms and Case Study in Chile

Abstract

Data mining, which is closely related to knowledge discovery in databases and to data science, is used to extract useful information and detect patterns in data sets that are often large. In this investigation, we formulate models based on machine learning algorithms to extract information relevant to predicting student retention at various levels, using higher education data and specifying the variables involved in the modeling. We then use this information to support the knowledge discovery process. We predict student retention at each of three levels, during the first, second, and third years of study, and obtain models whose accuracy exceeds 80% in all scenarios. These models allow us to adequately predict the level at which dropout occurs. The machine learning algorithms used in this work include decision trees, k-nearest neighbors, logistic regression, naive Bayes, random forest, and support vector machines, of which random forest performs best. We find that the secondary education score and the community poverty index are important predictive variables that have not previously been reported in educational studies of this type. The dropout assessment at various levels reported here is valid for higher education institutions around the world with conditions similar to those of the Chilean case, where dropout rates affect the efficiency of such institutions. Being able to predict dropout from student data enables these institutions to take preventive measures and avoid dropouts. In the case study, balancing the majority and minority classes improves the performance of the algorithms.
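The abstract notes that balancing the majority and minority classes improves performance, but it does not say which balancing method was used. As a minimal sketch, assuming plain random oversampling (a stand-in technique chosen for illustration, not necessarily the authors' method), the idea might look like the following. The function name `oversample_minority`, the feature columns, and the toy records are all hypothetical.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all
    classes have as many rows as the largest class.

    Assumption: random oversampling is used here only as an illustration;
    the abstract does not name a specific balancing method.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    balanced_rows, balanced_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            balanced_rows.append(rng.choice(pool))
            balanced_labels.append(cls)
    return balanced_rows, balanced_labels


# Hypothetical toy records: (secondary_score, poverty_index) per student.
rows = [(6.1, 0.12), (5.8, 0.20), (6.5, 0.08),
        (5.9, 0.15), (4.9, 0.35), (6.3, 0.10)]
labels = ["retained", "retained", "retained",
          "retained", "dropout", "retained"]

balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(Counter(balanced_labels))
```

In practice, balancing of this kind would normally be applied only to the training split, so that the evaluation data keep the original class proportions.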