Journal: Software Testing, Verification and Reliability

CSSG: A cost-sensitive stacked generalization approach for software defect prediction

Abstract

Classifying software artifacts as defect-prone (DP) or non-defect-prone (NDP) during the testing phase helps minimize software business costs; this is the classification task at the core of the software defect prediction (SDP) field. Machine learning methods are well suited to the task, but they face the challenge of imbalanced class distributions in the data. This imbalance leads to serious misclassification of artifacts, which degrades the predictor's performance. Previously developed stacking ensemble methods do not take cost into account when handling the class imbalance problem (CIP) in SDP training datasets. To bridge this research gap, in the cost-sensitive stacked generalization (CSSG) approach we combine the stacking ensemble learning method with cost-sensitive learning (CSL), whose purpose is to reduce misclassification costs. In CSSG, logistic regression (LR) and extremely randomized trees classifiers, each in a cost-sensitive and a cost-insensitive variant, are used as the final classifier of the stacking scheme. We evaluate the performance of CSSG with six performance measures. Several experiments compare CSSG with a number of cost-sensitive ensemble methods on 15 benchmark datasets with different imbalance levels. The results indicate that CSSG can be a more effective solution to the CIP than the other compared methods.
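
The stacking scheme described in the abstract can be illustrated with a minimal sketch using scikit-learn's `StackingClassifier`. This is not the authors' implementation. The synthetic dataset, the two base learners, the 1:5 cost ratio, and the use of `class_weight` as the cost-sensitive mechanism are all assumptions made for illustration, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data standing in for a defect dataset (assumption):
# class 1 = defect-prone (minority), class 0 = non-defect-prone.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical costs: a false alarm costs 1, a missed defect costs 5.
COST_FP, COST_FN = 1, 5
CLASS_WEIGHT = {0: COST_FP, 1: COST_FN}

# Hypothetical base learners; the abstract does not name them.
base_learners = [
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("nb", GaussianNB()),
]


def build_stack(final_estimator):
    """Stack the base learners under the given final (meta) classifier."""
    return StackingClassifier(
        estimators=base_learners, final_estimator=final_estimator, cv=5
    )


# Four final classifiers: LR and extremely randomized trees,
# each in a cost-insensitive and a cost-sensitive variant.
models = {
    "LR, cost-insensitive": build_stack(LogisticRegression(max_iter=1000)),
    "LR, cost-sensitive": build_stack(
        LogisticRegression(max_iter=1000, class_weight=CLASS_WEIGHT)
    ),
    "ET, cost-insensitive": build_stack(
        ExtraTreesClassifier(n_estimators=100, random_state=0)
    ),
    "ET, cost-sensitive": build_stack(
        ExtraTreesClassifier(
            n_estimators=100, class_weight=CLASS_WEIGHT, random_state=0
        )
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, preds, labels=[0, 1]).ravel()
    # Total misclassification cost under the hypothetical cost values.
    results[name] = {"fp": int(fp), "fn": int(fn),
                     "cost": int(fp * COST_FP + fn * COST_FN)}
    print(name, results[name])
```

Comparing the `cost` entries across the four variants gives a rough feel for how weighting the final classifier shifts the balance between false alarms and missed defects; the numbers on synthetic data say nothing about the results reported in the paper.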
