Conference: International Symposium on Empirical Software Engineering and Measurement

Mining Static Code Metrics for a Robust Prediction of Software Defect-Proneness



Abstract

Defect-proneness prediction is affected by multiple aspects, including sampling bias, non-metric factors, and model uncertainty. These aspects often contribute to prediction uncertainty and result in prediction variance. This paper proposes two methods for mining static code metrics to enhance defect-proneness prediction. Given that little non-metric or qualitative information can be extracted from software code, we first suggest using a robust unsupervised learning method, shared nearest neighbors (SNN), to extract the similarity patterns of the code metrics. These patterns indicate similar characteristics among components of the same cluster that may result in the introduction of similar defects. Using the similarity patterns together with code metrics as predictors, defect-proneness prediction may be improved. The second method uses Occam's window and Bayesian model averaging to deal with model uncertainty: first, the datasets are used to train and cross-validate multiple learners, and then highly qualified models are selected and integrated into a robust prediction. From a study based on 12 datasets from NASA, we conclude that our proposed solutions can contribute to better defect-proneness prediction.
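The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, since the abstract gives no formulas. It assumes that SNN similarity is the number of k-nearest neighbors two components share, that all models have equal prior probability, and that each model comes with a log score (`log_evidence`, a hypothetical name) standing in for its posterior support. The window ratio of 20 and all function names are likewise illustrative assumptions.

```python
import math


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(points, k):
    """Assumed SNN similarity: count of k-nearest neighbors two points share."""
    nn = knn_indices(points, k)
    n = len(points)
    return [
        [len(nn[i] & nn[j]) if i != j else 0 for j in range(n)]
        for i in range(n)
    ]


def occams_window_average(log_evidence, predictions, ratio=20.0):
    """Average predictions over the models that fall inside Occam's window.

    log_evidence: one log score per model (assumed proxy for posterior support).
    predictions:  one list of predicted defect probabilities per model.
    ratio:        models more than `ratio` times less supported than the
                  best model are discarded (illustrative default).
    """
    best = max(log_evidence)
    # Unnormalized posterior weights under equal priors, relative to the best.
    weights = [math.exp(le - best) for le in log_evidence]
    kept = [m for m, w in enumerate(weights) if w >= 1.0 / ratio]
    total = sum(weights[m] for m in kept)
    n_items = len(predictions[0])
    averaged = [
        sum(weights[m] * predictions[m][i] for m in kept) / total
        for i in range(n_items)
    ]
    return kept, averaged
```

In this sketch, a row of the matrix returned by `snn_similarity` could be appended to a component's code metrics as extra predictors, and `occams_window_average` would combine the cross-validated learners' outputs after dropping weakly supported models.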
