International Journal of Parallel, Emergent and Distributed Systems

Cross-project defect prediction using data sampling for class imbalance learning: an empirical study


Abstract

Defect data is available for many different projects, which makes cross-project defect prediction (CPDP) an open research issue in software engineering. In CPDP, the source and target projects are different: the prediction model is trained on data from the source projects and then tested on the target project's data. Pooling data from varying projects leads to a highly imbalanced source dataset, and this imbalance degrades the performance of the predictive model. In machine learning this is termed the class imbalance problem. This paper conducts a two-fold empirical analysis. First, it evaluates whether data sampling techniques can handle the class imbalance problem and improve the performance of the predictive model for CPDP. Second, it evaluates whether the results of CPDP after data sampling are comparable to those of within-project defect prediction (WPDP). Ensemble learning classifiers are used as the predictive models over 12 publicly available object-oriented project datasets. The experimental results indicate that SMOTE oversampling can be applied to overcome the class imbalance problem in CPDP, and that it gives results comparable to WPDP with statistical significance.
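To make the oversampling step concrete, the following is a minimal sketch of the SMOTE idea in plain Python: each synthetic defective sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is an illustration only, not the paper's experimental setup. The function names (smote, balance_source), the toy feature values, and the choice to resample only the pooled source data are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance_source(X, y, k=5, seed=0):
    """Oversample the defective class (label 1) of the pooled source data
    until both classes have the same number of rows."""
    minority = [x for x, label in zip(X, y) if label == 1]
    n_majority = sum(1 for label in y if label == 0)
    n_new = max(0, n_majority - len(minority))
    new_rows = smote(minority, n_new, k=k, seed=seed) if n_new else []
    return X + new_rows, y + [1] * len(new_rows)


# Hypothetical source-project metrics (two features per module); 1 = defective.
X_source = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [1.2, 0.9], [5.5, 8.5]]
y_source = [0, 0, 1, 1, 0, 0]

X_bal, y_bal = balance_source(X_source, y_source, k=1)
```

In a CPDP setting, the balanced source data would then be used to train the classifier, while the target project's data would be left unchanged for testing. A maintained implementation such as the SMOTE class in the imbalanced-learn package would normally be preferred over a hand-written one.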
