Conference: 2011 Fifth International Symposium on Empirical Software Engineering and Measurement

Mining Static Code Metrics for a Robust Prediction of Software Defect-Proneness



Abstract

Defect-proneness prediction is affected by several factors, including sampling bias, non-metric factors, and model uncertainty. These factors often contribute to prediction uncertainty and increase the variance of predictions. This paper proposes two methods for mining static code metrics to improve defect-proneness prediction. Given that little non-metric or qualitative information is extracted from software code, we first suggest using a robust unsupervised learning method, shared nearest neighbors (SNN), to extract similarity patterns from the code metrics. These patterns indicate that components in the same cluster share characteristics that may lead to the introduction of similar defects. Using the similarity patterns together with the code metrics as predictors may improve defect-proneness prediction. The second method uses Occam's window and Bayesian model averaging to deal with model uncertainty: first, the datasets are used to train and cross-validate multiple learners; then the best-qualified models are selected and combined into a robust prediction. From a study based on 12 NASA datasets, we conclude that the proposed methods can contribute to better defect-proneness prediction.
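The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract does not give their exact procedure. It assumes that SNN similarity is the number of k-nearest neighbors two components share, that each learner is summarized by a log-score such as a cross-validated log-likelihood, and that Occam's window discards any learner whose score-based weight falls below 1/ratio of the best one. All function names, the parameters `k` and `ratio`, and the default cutoff are hypothetical choices made for illustration.

```python
import math


def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbours (Euclidean)."""
    neighbours = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(others[:k]))
    return neighbours


def snn_similarity(points, k):
    """Assumed SNN similarity: count of k-nearest neighbours shared by two points."""
    nn = knn_sets(points, k)
    n = len(points)
    return [[len(nn[i] & nn[j]) if i != j else 0 for j in range(n)]
            for i in range(n)]


def occam_window_weights(log_scores, ratio=20.0):
    """Drop models far worse than the best one, then normalise the remaining weights.

    log_scores: one log-score per model (assumption: higher is better).
    ratio: illustrative cutoff; a model is kept only if its relative weight
           is at least 1/ratio of the best model's weight.
    """
    best = max(log_scores)
    relative = [math.exp(s - best) for s in log_scores]
    kept = [r if r >= 1.0 / ratio else 0.0 for r in relative]
    total = sum(kept)  # always >= 1.0 because the best model is kept
    return [r / total for r in kept]


def bma_predict(weights, model_probs):
    """Weighted average of each model's predicted defect probability."""
    return sum(w * p for w, p in zip(weights, model_probs))
```

For example, `snn_similarity(metric_vectors, k=2)` returns a matrix that could be clustered or appended to the code metrics as extra predictors, and `bma_predict(occam_window_weights(scores), probs)` combines the retained learners' defect probabilities for one component.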
