Conference: International Conference on Advanced Data Mining and Applications

Addressing Class Imbalance and Cost Sensitivity in Software Defect Prediction by Combining Domain Costs and Balancing Costs


Abstract

Effective methods for identifying software defects help minimize the business costs of software development. Classification methods can be used to perform software defect prediction. When cost-sensitive methods are used, the predictions are optimized for business cost. The data sets used as input for these methods typically suffer from the class imbalance problem. That is, there are many more defect-free code examples than defective code examples to learn from. This negatively impacts the classifier's ability to correctly predict defective code examples. Cost-sensitive classification can also be used to mitigate the effects of the class imbalance problem by setting the costs to reflect the level of imbalance in the training data set. Through an experimental process, we have developed a method for combining these two different types of costs. We demonstrate that by using our proposed approach, we can produce more cost-effective predictions than several recent cost-sensitive methods used for software defect prediction. Furthermore, we examine the software defect prediction models built by our method and present the discovered insights.
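
The two kinds of cost mentioned in the abstract can be illustrated with a small sketch. It is not the authors' method: the abstract does not state how the costs are combined, so the product rule in `combine_costs` is a placeholder assumption, and all function names and the example cost values are hypothetical. The sketch derives a balancing cost from the class ratio, multiplies it by an assumed domain cost, and applies the standard minimum-expected-cost decision threshold.

```python
from collections import Counter


def balancing_costs(labels):
    """Misclassification cost per class, inversely proportional to class
    frequency. The majority class gets cost 1.0."""
    counts = Counter(labels)
    majority = max(counts.values())
    return {cls: majority / n for cls, n in counts.items()}


def combine_costs(domain_costs, balance_costs):
    """Hypothetical combination rule: a plain product of the two costs.
    The abstract does not say which rule the authors actually use."""
    return {cls: domain_costs[cls] * balance_costs[cls] for cls in domain_costs}


def decision_threshold(costs, positive=1, negative=0):
    """Minimum-expected-cost threshold on P(defective).
    costs[c] is the cost of misclassifying an example whose true class is c,
    so costs[positive] is the false-negative cost and costs[negative] is the
    false-positive cost. Predict positive when p > C_fp / (C_fp + C_fn)."""
    c_fn = costs[positive]
    c_fp = costs[negative]
    return c_fp / (c_fp + c_fn)


def predict(prob_defective, costs):
    """Return 1 (defective) or 0 (defect-free) for one predicted probability."""
    return 1 if prob_defective > decision_threshold(costs) else 0


# Illustrative data: 90 defect-free modules (0) and 10 defective ones (1).
labels = [0] * 90 + [1] * 10

# Assumed domain costs: a missed defect is 5x as costly as a false alarm.
domain = {0: 1.0, 1: 5.0}

balance = balancing_costs(labels)
combined = combine_costs(domain, balance)
threshold = decision_threshold(combined)

print("balancing costs:", balance)
print("combined costs:", combined)
print("decision threshold:", round(threshold, 4))
```

Under these assumed numbers the combined cost pushes the decision threshold well below 0.5, so a module is flagged as defective even at a low predicted probability. That is the intended effect of weighting the rare, expensive class more heavily.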
