Journal: Automated Software Engineering

Multiple kernel ensemble learning for software defect prediction


Abstract

Software defect prediction aims to predict the defect proneness of new software modules from historical defect data, so as to improve the quality of a software system. Historical defect data has a complicated structure and is markedly class-imbalanced; how to fully analyze and exploit this data to build more precise and effective classifiers has attracted considerable interest from researchers in both academia and industry. Multiple kernel learning and ensemble learning are effective machine learning techniques. Multiple kernel learning can map the historical defect data to a higher-dimensional feature space in which it is better represented, and ensemble learning can combine a series of weak classifiers to reduce the bias toward the majority class and obtain better predictive performance. In this paper, we propose using multiple kernel learning to predict software defects. Using the characteristics of metrics mined from open source software, we obtain a multiple kernel classifier through an ensemble learning method, which combines the advantages of both multiple kernel learning and ensemble learning. We thus propose a multiple kernel ensemble learning (MKEL) approach for software defect classification and prediction. Considering the cost of risk in software defect prediction, we design a new sample weight vector updating strategy to reduce the cost of risk caused by misclassifying defective modules as non-defective ones. We use the widely used NASA MDP datasets as test data to evaluate the performance of all compared methods; experimental results show that MKEL outperforms several representative state-of-the-art defect prediction methods.
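The abstract does not spell out the kernel combination or the weight updating rule, so the following is only a minimal sketch of the two ideas it names, not the MKEL algorithm itself. It assumes a fixed convex combination of a linear and an RBF kernel, and a generic AdaBoost-style update in which missed defects are up-weighted by an extra factor. The names `combined_kernel`, `update_weights`, `betas`, `alpha`, and `fn_cost` are hypothetical and chosen for illustration.

```python
import math


def linear_kernel(x, z):
    """Dot product between two metric vectors."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two metric vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def combined_kernel(x, z, betas=(0.5, 0.5)):
    """Convex combination of base kernels (assumed form, weights are illustrative)."""
    return betas[0] * linear_kernel(x, z) + betas[1] * rbf_kernel(x, z)


def update_weights(weights, y_true, y_pred, alpha, fn_cost=2.0):
    """Cost-sensitive boosting-style update (an assumption, not the paper's rule).

    Labels: +1 = defective, -1 = non-defective.
    Correctly classified samples are down-weighted; misclassified samples are
    up-weighted, and defective modules predicted as non-defective receive an
    additional factor fn_cost > 1.
    """
    updated = []
    for w, y, p in zip(weights, y_true, y_pred):
        if y == p:
            factor = math.exp(-alpha)
        elif y == 1:
            # Defective module missed by the weak classifier: the costly error.
            factor = fn_cost * math.exp(alpha)
        else:
            factor = math.exp(alpha)
        updated.append(w * factor)
    total = sum(updated)
    # Renormalize so the weights again form a distribution.
    return [w / total for w in updated]


if __name__ == "__main__":
    weights = [0.25, 0.25, 0.25, 0.25]
    y_true = [1, 1, -1, -1]
    y_pred = [-1, 1, 1, -1]  # one missed defect, one false alarm
    print(update_weights(weights, y_true, y_pred, alpha=0.5))
```

In this sketch the missed defect ends up with the largest weight, the false alarm with a smaller one, and the correctly classified samples with the smallest, so the next weak classifier is pushed toward the defective class.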
