The Performance Stability of Defect Prediction Models with Class Imbalance: An Empirical Study

Qiao YU; Shujuan JIANG; Yanmei ZHANG

Abstract

Class imbalance has drawn much attention from researchers in software defect prediction. In practice, the performance of defect prediction models may be affected by the class imbalance problem. In this paper, we present an approach to evaluating the performance stability of defect prediction models on imbalanced datasets. First, random sampling is applied to convert the original imbalanced dataset into a set of new datasets with different imbalance ratios. Second, typical prediction models are selected to make predictions on these newly constructed datasets, and the Coefficient of Variation (C·V) is used to evaluate the performance stability of the different models. Finally, an empirical study is designed to evaluate the performance stability of six prediction models that are widely used in software defect prediction. The results show that the performance of C4.5 is unstable on imbalanced datasets, and that the performance of Naive Bayes and Random Forest is more stable than that of the other models.
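The two steps described in the abstract, resampling to several imbalance ratios and then summarizing the spread of a performance score with the Coefficient of Variation (standard deviation divided by mean), can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. The abstract does not say how the sampling is done or which performance measure is used, so the under-sampling of the non-defective class, the F1 score, the fixed-threshold "model", the toy data, and all function names here are assumptions.

```python
import random
import statistics


def resample_to_ratio(defective, clean, ratio, rng):
    """Build a dataset with about `ratio` clean modules per defective module.

    Assumption: imbalance is varied by randomly under-sampling the clean
    (majority) class while keeping every defective module.
    """
    n_clean = min(len(clean), round(ratio * len(defective)))
    return defective + rng.sample(clean, n_clean)


def f1_score(dataset, threshold=1.0):
    """F1 of a fixed threshold rule, a stand-in for a trained model."""
    tp = sum(1 for x, y in dataset if x > threshold and y == 1)
    fp = sum(1 for x, y in dataset if x > threshold and y == 0)
    fn = sum(1 for x, y in dataset if x <= threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def coefficient_of_variation(scores):
    """C.V = standard deviation / mean (sample standard deviation assumed)."""
    return statistics.stdev(scores) / statistics.mean(scores)


# Hypothetical toy data: (feature, label) pairs, label 1 = defective.
rng = random.Random(0)
defective = [(rng.gauss(2.0, 1.0), 1) for _ in range(50)]
clean = [(rng.gauss(0.0, 1.0), 0) for _ in range(500)]

# Step 1: derive datasets at several imbalance ratios (clean : defective).
ratios = (1, 2, 5, 10)
datasets = {r: resample_to_ratio(defective, clean, r, rng) for r in ratios}

# Step 2: score the model on each dataset and summarize the spread.
scores = [f1_score(datasets[r]) for r in ratios]
cv = coefficient_of_variation(scores)
print(f"F1 per ratio: {scores}")
print(f"C.V = {cv:.3f}")
```

A lower C.V across the ratios would indicate a model whose score changes less as the imbalance level changes, which is the sense of "stable" used in the abstract.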

Bibliographic details

Source: IEICE Transactions on Information and Systems, 2017, No. 2, 8 pages
Authors: Qiao YU; Shujuan JIANG; Yanmei ZHANG
Original format: PDF
CLC classification: Radio electronics, telecommunications technology
