Tackling Feature Selection Problems with Genetic Algorithms in Software Defect Prediction for Optimization


Abstract

Software defect prediction improves software quality by finding and tracking defective modules, which helps reduce costs during software testing. Machine learning methods can be applied to predict defects in each software module. However, software defect prediction datasets have two problems: class imbalance, with very few defective modules compared to non-defective modules, and noisy attributes caused by irrelevant features. These two problems lead to overfitting and biased classification results, which significantly reduces the performance of the machine learning model. In this study, we propose applying bagging techniques and genetic algorithms to improve the classification performance of machine learning models for software defect prediction based on Logistic Regression, Naive Bayes, SVM, KNN, and Decision Tree. Bagging and genetic algorithms address the two main problems in software defect prediction: class imbalance and feature selection, respectively. We used 6 NASA Promise datasets to evaluate classification performance based on AUC and G-Mean values. The results using 10-fold cross-validation show that the proposed method improves classification performance compared to the original algorithms. The Decision Tree shows the highest performance on 3 of the datasets tested, with the highest value of 94.61% on the KC4 dataset. We also compare GA performance with another nature-inspired algorithm, Particle Swarm Optimization (PSO). The results show that all machine learning models perform better with GA than with PSO.
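
The abstract does not give the GA's encoding, operators, or parameters, so the following is only a minimal sketch of how genetic-algorithm feature selection is commonly set up: each individual is a 0/1 mask over the features, and a fitness function scores the mask. In the setting described above, the fitness would plausibly be the cross-validated AUC of a bagged classifier trained on the selected columns; here a stand-in `toy_fitness` is used so the example runs on its own. The function names, the `INFORMATIVE` set, and all parameter values are assumptions made for illustration, not taken from the paper.

```python
import random

def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Return (best_mask, best_score); a mask is a list of 0/1 flags, one per feature."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_mask, best_score = None, float("-inf")

    for _ in range(generations):
        scores = [fitness(mask) for mask in population]
        # Remember the best mask seen so far.
        for mask, score in zip(population, scores):
            if score > best_score:
                best_mask, best_score = mask[:], score

        def tournament():
            # Binary tournament: draw two individuals, keep the fitter one.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[a] if scores[a] >= scores[b] else population[b]

        next_population = [best_mask[:]]  # elitism: carry the best mask forward
        while len(next_population) < pop_size:
            parent1, parent2 = tournament(), tournament()
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = parent1[:cut] + parent2[cut:]
            else:
                child = parent1[:]
            # Bit-flip mutation: each flag flips with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population

    # Score the final generation as well.
    for mask in population:
        score = fitness(mask)
        if score > best_score:
            best_mask, best_score = mask[:], score
    return best_mask, best_score

# Hypothetical stand-in for "AUC of a bagged classifier on the selected columns":
# pretend only features 0, 1 and 2 carry signal and every other feature is noise.
INFORMATIVE = {0, 1, 2}

def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    noise = sum(bit for i, bit in enumerate(mask) if i not in INFORMATIVE)
    return hits - 0.5 * noise

best_mask, best_score = ga_feature_selection(10, toy_fitness)
print(best_mask, best_score)
```

To adapt the sketch to real data, `toy_fitness` would be replaced by a function that trains and evaluates a classifier on the columns whose flag is 1; the search loop itself does not need to change.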

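
The abstract reports G-Mean alongside AUC but does not define it. A common definition, assumed here, is the geometric mean of the true positive rate and the true negative rate. The sketch below (function names are illustrative) shows why such a metric is useful under class imbalance: a predictor that labels every module as non-defective can score high accuracy while its G-Mean is zero.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (TPR) and specificity (TNR); 1 = defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)

def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Made-up imbalanced labels: 1 defective module out of 10.
y_true = [1] + [0] * 9
always_clean = [0] * 10  # predictor that never flags a defect

print(accuracy(y_true, always_clean))  # 0.9
print(g_mean(y_true, always_clean))    # 0.0
```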

Bibliographic record

Source
International Conference on Informatics, Multimedia, Cyber and Information System | 2020 | pp. 64-69 | 6 pages
Authors
Rizal Broer Bahaweres; Arif Imam Suroso; Alam Wahyu Hutomo; Indra Permana Solihin; Irman Hermadi; Yandra Arkeman;
Original format: PDF
Keywords
Machine learning algorithms; Machine learning; Predictive models; Prediction algorithms; Software; Classification algorithms; Genetic algorithms;


