Conference: International Conference on Informatics, Multimedia, Cyber and Information System

Tackling Feature Selection Problems with Genetic Algorithms in Software Defect Prediction for Optimization

Abstract

Software defect prediction improves software quality by finding and tracking defective modules, which helps reduce costs during testing. Machine learning methods can be applied to predict defects in each software module. However, software defect prediction datasets typically have two problems: class imbalance, with very few defective modules compared to non-defective ones, and noisy attributes caused by irrelevant features. Together, these problems cause overfitting and biased classification results, which significantly reduce the performance of the machine learning model. In this study, we propose combining bagging with a genetic algorithm (GA) to improve the classification performance of machine learning models for software defect prediction based on Logistic Regression, Naive Bayes, SVM, KNN, and Decision Tree. The two techniques address the two main problems respectively: bagging handles class imbalance, and the genetic algorithm handles feature selection. We used 6 NASA Promise datasets to evaluate classification performance in terms of AUC and G-Means. Results from 10-fold cross-validation show that the proposed method improves classification performance compared to the original algorithms. The Decision Tree achieves the highest performance on 3 of the datasets tested, with a highest value of 94.61% on the KC4 dataset. We also compare GA with another nature-inspired algorithm, Particle Swarm Optimization (PSO). The results show that all machine learning models perform better with GA than with PSO.
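As a rough illustration of the two ideas the abstract names, the sketch below shows a G-mean metric and a genetic algorithm that searches over feature subsets. It is not the authors' implementation. The G-mean here uses the common definition (square root of sensitivity times specificity), and the GA operators (binary tournament selection, one-point crossover, bit-flip mutation, elitism) as well as all names and parameter values (`ga_select`, `toy_fitness`, `RELEVANT`, population size, rates) are assumptions chosen for the example.

```python
import math
import random


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (label 1 = defective)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        return 0.0  # undefined without both classes; treated as 0 in this sketch
    return math.sqrt((tp / pos) * (tn / neg))


def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask. Returns (best_mask, best score per generation)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        next_pop = [elite[:]]  # elitism: the best mask survives unchanged
        while len(next_pop) < pop_size:
            # binary tournament selection for each parent
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            child = p1[:]
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # bit-flip mutation, applied independently to each gene
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical stand-in for a real fitness: reward "informative" features,
# penalize the rest. In practice the fitness would be a cross-validated score
# (for example AUC or G-mean) of a classifier trained on the selected columns.
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i, b in enumerate(mask) if b and i in RELEVANT)
    noise = sum(1 for i, b in enumerate(mask) if b and i not in RELEVANT)
    return hits - 0.5 * noise


best_mask, history = ga_select(8, toy_fitness)
```

Because the best mask of each generation is copied into the next one, the recorded best score never decreases. Replacing `toy_fitness` with a function that trains a bagged classifier on the masked columns and returns its cross-validated G-mean would bring the sketch closer to the setup the abstract describes, though the paper's actual configuration is not given here.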