Handling Imbalanced Data using Ensemble Learning in Software Defect Prediction

Abstract

With the ever-growing software industry, software defect prediction is one of the key ingredients in the recipe for producing good-quality software. Defects uncovered in good time help save resources in terms of time, effort and money. However, the imbalanced nature of software data may hamper the performance of the resulting models, leading to incorrect interpretation of results. This problem has drawn the attention of researchers, and many solutions have been proposed to overcome its effect. This paper aims to provide an empirical comparison of software defect prediction models developed using various boosting-based ensemble methods on three open-source Java projects. Four ensemble methods incorporate resampling techniques. The performance of the resulting models is evaluated using stable metrics such as Balance, G-Mean and AUC. The results show that applying resampling techniques before classifying with an ensemble method significantly improves model prediction compared with classic boosting models. RUSBoost is the undisputed winner among all, followed by MSMOTEBoost and SMOTEBoost.
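
The abstract names Balance, G-Mean and AUC as evaluation metrics but does not define them. As a minimal sketch, assuming the definitions commonly used in the defect prediction literature (G-Mean as the geometric mean of the true positive and true negative rates, Balance as one minus the normalized Euclidean distance from the ideal point pd = 1, pf = 0), the two threshold-based metrics can be computed from a confusion matrix as shown here. The function names and the toy labels are illustrative and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def g_mean(tp, tn, fp, fn):
    """Geometric mean of true positive rate and true negative rate."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


def balance(tp, tn, fp, fn):
    """Assumed definition: 1 - distance from (pf=0, pd=1), scaled by sqrt(2)."""
    pd = tp / (tp + fn) if (tp + fn) else 0.0  # probability of detection (recall)
    pf = fp / (fp + tn) if (fp + tn) else 0.0  # probability of false alarm
    return 1.0 - math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)


# Toy imbalanced data: 2 defective modules out of 10, and a classifier
# that always predicts "clean".
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10

counts = confusion_counts(y_true, y_pred)
accuracy = (counts[0] + counts[1]) / len(y_true)
print("accuracy:", accuracy)
print("G-Mean:", g_mean(*counts))
print("Balance:", round(balance(*counts), 3))
```

On this toy data the always-clean classifier reaches 80% accuracy while its G-Mean is 0, because it never detects a defective module. This is the kind of misleading result on imbalanced data that motivates metrics sensitive to both classes.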

Bibliographic record

Source
International Conference on Cloud Computing, Data Science Engineering | 2020 | pp. 300-304 | 5 pages
Authors
Ruchika Malhotra; Juhi Jain
Original format: PDF
Keywords
Software; Boosting; Predictive models; Java; Sensitivity; Synapses

Similar literature

1. SMOTEFRIS-INFFC: Handling the challenge of borderline and noisy examples in imbalanced learning for software defect prediction [J] . Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology . 2020, No. 1aPta2

2. SMOTEFRIS-INFFC: Handling the challenge of borderline and noisy examples in imbalanced learning for software defect prediction [J] . Bashir Kamal, Li Tianrui, Yohannese Chubato Wondaferaw, Ecological restoration . 2020, No. 1

3. The Comparison of Imbalanced Data Handling Method in Software Defect Prediction [J] . Khadijah Khadijah, Priyo Sidik Sasongko, Kinetik . 2020, No. 3

4. SAGA: A Hybrid Technique to handle Imbalance Data in Software Defect Prediction [C] . Ruchika Malhotra, Ritvik Kapoor, Paridhi Saxena, IEEE Symposium on Computer Applications Industrial Electronics . 2021

5. Active learning with support vector machines for imbalanced datasets and a method for stopping active learning based on stabilizing predictions. [D] . Bloodgood, Michael. 2009

6. JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method [O] . Lina Zhang, Chengjin Zhang, Rui Gao

7. Software Defect Prediction Using AWEIG+ADACOST Bayesian Algorithm for Handling High Dimensional Data and Class Imbalance Problem [O] . Joko Suntoro, Febrian Wahyu Christanto, Henny Indriyawati . 2018
