Performance Analysis of Feature Selection Methods in Software Defect Prediction: A Search Method Approach

Abdullateef Oluwagbemiga Balogun; Shuib Basri; Said Jadid Abdulkadir; Ahmad Sobri Hashim

Abstract

Software Defect Prediction (SDP) models are built using software metrics derived from software systems. The quality of SDP models depends largely on the quality of the software metrics (dataset) used to build them. High dimensionality is one of the data quality problems that affect the performance of SDP models. Feature selection (FS) is a proven method for addressing the dimensionality problem. However, the choice of FS method for SDP is still a problem, as most empirical studies on FS methods for SDP produce contradictory and inconsistent quality outcomes. FS methods behave differently because of their different underlying computational characteristics. This could be due to the choice of search method used in FS, because the impact of FS depends on the choice of search method. It is hence imperative to comparatively analyze the performance of FS methods based on different search methods in SDP. In this paper, four filter feature ranking (FFR) and fourteen filter feature subset selection (FSS) methods were evaluated using four different classifiers over five software defect datasets obtained from the National Aeronautics and Space Administration (NASA) repository. The experimental analysis showed that the application of FS improves the predictive performance of classifiers and that the performance of FS methods can vary across datasets and classifiers. Among the FFR methods, Information Gain demonstrated the greatest improvements in the performance of the prediction models. Among the FSS methods, Consistency Feature Subset Selection based on Best First Search had the best influence on the prediction models. However, prediction models based on FFR proved to be more stable than those based on FSS methods. Hence, we conclude that FS methods improve the performance of SDP models, and that there is no single best FS method, as their performance varied according to the dataset and the choice of prediction model. However, we recommend the use of FFR methods, as the prediction models based on FFR are more stable in terms of predictive performance.
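
As a rough illustration of the filter feature ranking idea named in the abstract, the sketch below scores discretized features by information gain against a defect label and keeps the top-ranked ones. It is not the authors' implementation: the feature names, the eight-module toy data, the helper functions, and the cutoff of two features are all invented for illustration, and real software metrics would first need to be discretized.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (feature names sorted by decreasing information gain, score per feature)."""
    scores = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical, made-up data: discretized metrics for eight modules.
# These values are NOT taken from the NASA datasets used in the paper.
columns = {
    "loc_bin":        ["high", "high", "high", "high", "low", "low", "low", "low"],
    "complexity_bin": ["high", "high", "low", "high", "low", "low", "high", "low"],
    "comment_bin":    ["high", "low", "high", "low", "high", "low", "high", "low"],
}
defective = [1, 1, 1, 1, 0, 0, 0, 0]

ranking, scores = rank_features(columns, defective)
top_k = ranking[:2]  # assumed cutoff; a classifier would then be trained on these features only

for name in ranking:
    print(f"{name}: {scores[name]:.4f}")
```

In this toy data, `loc_bin` separates defective from clean modules perfectly and ranks first, while `comment_bin` carries no information about the label and ranks last. A ranking method like this scores each feature independently, whereas the subset selection (FSS) methods in the abstract evaluate groups of features using a search strategy such as Best First Search.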

Bibliographic details

Source: Applied Sciences, 2019, Issue 13, 20 pages
Authors: Abdullateef Oluwagbemiga Balogun; Shuib Basri; Said Jadid Abdulkadir; Ahmad Sobri Hashim
Original format: PDF
Keywords: software defect prediction; feature selection; high dimensionality; search methods
