Journal: Science of Computer Programming

Adaptive selection of classifiers for bug prediction: A large-scale empirical analysis of its performances and a benchmark study



Abstract

Bug prediction aims at locating defective source code components by relying on machine learning models. Although some previous work showed that the choice of machine-learning classifier is crucial, the reported results are conflicting. Therefore, several ensemble techniques, i.e., approaches that combine the output of different classifiers, have been proposed. In this paper, we present a benchmark study in which we compare the performance of seven ensemble techniques on 21 open-source software projects. Our aim is twofold. On the one hand, we aim at addressing the limitations of previous empirical studies that compared the accuracy of ensemble approaches in bug prediction. On the other hand, our goal is to verify how ensemble techniques perform in different settings such as cross- and local-project defect prediction. Our empirical results show that ensemble techniques are not a silver bullet for bug prediction. In within-project bug prediction, using ensemble techniques improves the prediction performance with respect to the best stand-alone classifier. We confirm that the models based on Validation and Voting achieve slightly better results, although these are similar to those obtained by other ensemble techniques. Identifying buggy classes using external sources of information is still an open problem: in this setting, the use of ensemble techniques does not provide evident benefits with respect to stand-alone classifiers. The statistical analysis highlights that local and global models are mostly equivalent in terms of performance. Only one ensemble technique (i.e., ASCI) slightly exploits local learning to improve performance.
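
As a rough illustration of what "mixing the output of different classifiers" can mean, the sketch below combines binary buggy/clean predictions by simple majority vote. It is a generic example, not the Validation and Voting or ASCI procedure evaluated in the paper; the function name `majority_vote` and the sample predictions are hypothetical.

```python
def majority_vote(predictions):
    """Combine binary predictions from several classifiers.

    `predictions` is a list with one inner list per classifier; each inner
    list holds 1 (buggy) or 0 (clean) for the same sequence of components.
    A component is flagged buggy only if strictly more than half of the
    classifiers flag it, so ties resolve to "clean" (an assumption of this
    sketch, not something the abstract specifies).
    """
    n_classifiers = len(predictions)
    return [sum(votes) * 2 > n_classifiers for votes in zip(*predictions)]


# Hypothetical outputs of three stand-alone classifiers on four components.
clf_a = [1, 0, 1, 0]
clf_b = [1, 1, 0, 0]
clf_c = [0, 1, 1, 0]

print(majority_vote([clf_a, clf_b, clf_c]))  # [True, True, True, False]
```

Each of the first three components is flagged by two of the three classifiers, so the vote marks them buggy even though no single classifier flags all three.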
