Applying Novel Resampling Strategies To Software Defect Prediction


Abstract

Due to the tremendous complexity and sophistication of software, improving software reliability is an enormously difficult task. We study the software defect prediction problem, which focuses on predicting which modules will experience a failure during operation. Numerous studies have applied machine learning to software defect prediction; however, skewness in defect-prediction datasets usually undermines the learning algorithms. The resulting classifiers will often never predict the faulty (minority) class. This problem is well known in machine learning and is often referred to as learning from imbalanced datasets. We examine stratification, a widely used technique for learning from imbalanced data that has received little attention in software defect prediction. Our experiments focus on the SMOTE technique, a method of over-sampling minority-class examples. Our goal is to determine whether SMOTE can improve recognition of defect-prone modules, and at what cost. Our experiments demonstrate that SMOTE resampling yields a more balanced classification. We found an improvement of at least 23% in the average geometric mean classification accuracy on four benchmark datasets.
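To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified SMOTE variant that picks a random minority example, chooses one of its k nearest minority neighbours, and interpolates a synthetic point between them. It also assumes the geometric mean accuracy is the square root of the product of the true positive rate and the true negative rate. The function names `smote` and `geometric_mean_accuracy`, the parameters `n_synthetic`, `k`, and `seed`, and the toy data are all hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Simplified SMOTE: interpolate between minority points and their neighbours.

    Assumes `minority` holds at least two numeric tuples of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (the base itself excluded)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def geometric_mean_accuracy(y_true, y_pred, positive=1):
    """sqrt(TPR * TNR); assumes both classes occur in y_true."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    pos = sum(1 for t in y_true if t == positive)
    neg = len(y_true) - pos
    return math.sqrt((tp / pos) * (tn / neg))


# Toy minority (faulty) modules described by two hypothetical metrics
faulty = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0)]
new_points = smote(faulty, n_synthetic=4, k=2)

# A classifier that never predicts the faulty class scores zero
print(geometric_mean_accuracy([1, 0, 0, 0], [0, 0, 0, 0]))  # 0.0
```

Under this definition the geometric mean is zero for a classifier that never predicts the faulty class, however high its overall accuracy. That makes it a natural measure for imbalanced data.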

Bibliographic record

Source: 2007, pp. 69-72 (4 pages)
Authors: Lourdes Pelayo; Scott Dick
Original format: PDF

Published in: Annual Meeting of the North American Fuzzy Information Processing Society, 2007 (conference paper; Lourdes Pelayo, Scott Dick)