
Is Bigger Data Better for Defect Prediction: Examining the Impact of Data Size on Supervised and Unsupervised Defect Prediction


Abstract

Defect prediction helps software practitioners predict where bugs are likely to occur in software code. To improve prediction accuracy, dozens of supervised and unsupervised methods have been proposed and have achieved good results. One limiting factor is that defect datasets are small, which restricts the scope of application of defect prediction models. In this study, we construct bigger defect datasets by merging available datasets with the same measurement dimension and check whether bigger data brings better defect prediction performance with supervised and unsupervised models. The results of our experiment reveal that a larger-scale dataset does not improve the performance of either supervised or unsupervised classifiers.
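
The merging step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "the same measurement dimension" means datasets that record the same list of metrics in the same order; the function `merge_by_schema`, the project names, and the metric names (`loc`, `cc`, `fan_in`) are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def merge_by_schema(datasets):
    """Concatenate the rows of all datasets that share one metric list.

    `datasets` maps a dataset name to (metric_names, rows), where each row
    is (metric_values, is_defective). Assumption: two datasets are mergeable
    only if their metric names match exactly, in the same order.
    """
    merged = defaultdict(list)
    for name, (metric_names, rows) in datasets.items():
        key = tuple(metric_names)
        for values, label in rows:
            if len(values) != len(key):
                raise ValueError(f"{name}: row width does not match its metric list")
            merged[key].append((tuple(values), bool(label)))
    return dict(merged)


# Toy data, invented purely for illustration.
datasets = {
    "proj_a": (["loc", "cc"], [([120, 4], 1), ([30, 1], 0)]),
    "proj_b": (["loc", "cc"], [([75, 2], 0)]),
    "proj_c": (["loc", "cc", "fan_in"], [([200, 9, 3], 1)]),
}

merged = merge_by_schema(datasets)
for schema, rows in merged.items():
    print(schema, len(rows))
```

Here `proj_a` and `proj_b` end up in one merged dataset, while `proj_c` stays separate because it records an extra metric. Each merged group could then be passed to whichever supervised or unsupervised model is being evaluated and compared against the per-project results.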

Bibliographic details

Source
International conference on web information systems and applications | 2019 | pp. 138-150 | 13 pages
Authors
Xinyue Liu; Yanhui Li
Original format
PDF
Keywords
Defect prediction; Supervised; Classifier; Data size

Similar literature

1. Software defect number prediction: Unsupervised vs supervised methods [J]. Xiang Chen, Dun Zhang, Yingquan Zhao. Information and software technology, 2019, issue FEBa.
2. Revisiting supervised and unsupervised models for effort-aware just-in-time defect prediction [J]. Huang Qiao, Xia Xin, Lo David. Empirical Software Engineering, 2019, issue 5.
3. Is Bigger Data Better for Defect Prediction: Examining the Impact of Data Size on Supervised and Unsupervised Defect Prediction [C]. Xinyue Liu, Yanhui Li. International conference on web information systems and applications, 2019.
4. Software defect prediction on unlabeled datasets [D]. Nam, Jaechang. 2015.
5. Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions [O]. Yohan Kim, John Sidney, Søren Buus. 2014.
6. First principles predictions of magneto-optical data for semiconductor defects: the case of divacancy defects in 4H-SiC [O]. Davidsson, Joel; Ivády, Viktor; Armiento, Rickard. 2017.
