IEICE Transactions on Information and Systems

The Performance Stability of Defect Prediction Models with Class Imbalance: An Empirical Study



Abstract

Class imbalance has drawn considerable attention from researchers in software defect prediction. In practice, the performance of defect prediction models may be affected by the class imbalance problem. In this paper, we present an approach to evaluating the performance stability of defect prediction models on imbalanced datasets. First, random sampling is applied to convert the original imbalanced dataset into a set of new datasets with different imbalance ratios. Second, typical prediction models are selected to make predictions on these newly constructed datasets, and the coefficient of variation (CV) is used to evaluate the performance stability of the different models. Finally, an empirical study is designed to evaluate the performance stability of six prediction models that are widely used in software defect prediction. The results show that the performance of C4.5 is unstable on imbalanced datasets, and that the performance of Naive Bayes and Random Forest is more stable than that of the other models.
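The procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the random sampling is done or which form of the standard deviation the CV uses, so the following are assumptions made only for illustration: the clean (majority) class is undersampled to reach each target ratio, the CV is the sample standard deviation divided by the mean, and `evaluate` is a hypothetical callback that trains a model on a dataset and returns one performance score. All function names are invented for this sketch.

```python
import random
import statistics


def coefficient_of_variation(scores):
    """Return the sample standard deviation divided by the mean.

    Assumption: the sample (n - 1) form is used; the abstract does not
    say which form the authors chose.
    """
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(scores) / mean


def resample_to_ratio(defective, clean, ratio, rng):
    """Build a dataset with roughly `ratio` clean modules per defective one.

    Assumption: the target ratio is reached by randomly undersampling
    the clean class without replacement, capped at the available data.
    """
    n_clean = min(len(clean), round(ratio * len(defective)))
    return defective + rng.sample(clean, n_clean)


def stability(evaluate, defective, clean, ratios, seed=0):
    """Score one model across several imbalance ratios and return the CV.

    `evaluate` is a hypothetical callback: dataset -> performance score.
    A lower CV means the score varies less as the ratio changes.
    """
    rng = random.Random(seed)
    scores = [
        evaluate(resample_to_ratio(defective, clean, ratio, rng))
        for ratio in ratios
    ]
    return coefficient_of_variation(scores)
```

Under these assumptions, comparing models amounts to calling `stability` once per model with the same `ratios` and ranking the resulting CV values, with smaller values indicating more stable performance.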
