Journal: International Journal of Performability Engineering

Impact of Hyper Parameter Optimization for Cross-Project Software Defect Prediction

Abstract

Most recent studies of cross-project defect prediction (CPDP) leave the hyperparameters of the underlying classification methods at their default values. However, earlier studies of within-project defect prediction (WPDP) found that hyperparameter optimization helps to improve the performance of software defect prediction models. Moreover, the default values of some hyperparameters may not be consistent across machine learning libraries (such as Weka and Scikit-learn). To the best of our knowledge, we are the first to conduct an in-depth analysis of how hyperparameter optimization influences CPDP performance. Based on different classification methods, we consider a total of 5 instance-selection-based CPDP methods. In our empirical studies, we choose 8 projects from the AEEEM and Relink datasets as evaluation subjects and use AUC as the model performance measure. The results show that the influence of hyperparameter optimization is non-negligible for 4 of these methods. Of the 11 hyperparameters considered across the 5 classification methods, 8 have a non-negligible influence, and these are mainly found in the support vector machine and k-nearest-neighbor classification methods. Meanwhile, an analysis of the actual computational cost of hyperparameter optimization shows that the time spent is within an acceptable range. These empirical results indicate that future CPDP research should consider hyperparameter optimization in its experimental design.
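To make the comparison in the abstract concrete, here is a minimal Python sketch of tuning one hyperparameter (k for k-nearest-neighbor) on a source project and scoring AUC on a different target project. It is not the paper's experimental setup. The data is synthetic, the hold-out tuning scheme and the candidate grid are assumptions, and the instance selection step is omitted. The helper names (`make_project`, `tune_k`, `knn_scores`) are hypothetical. The default of k = 1 follows Weka's IBk, while scikit-learn's `KNeighborsClassifier` defaults to 5 neighbors, which illustrates the inconsistency between libraries noted above.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a defective module outranks a clean one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one instance of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def knn_scores(train_X, train_y, test_X, k):
    """Score each test module by the fraction of defective modules among its k nearest neighbors."""
    scores = []
    for x in test_X:
        by_distance = sorted(
            range(len(train_X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
        )
        scores.append(sum(train_y[i] for i in by_distance[:k]) / k)
    return scores

def make_project(rng, n, shift):
    """Synthetic stand-in for a project: two metrics per module, about 30% defective (assumption)."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        X.append([rng.gauss(1.5 * label + shift, 1.0), rng.gauss(1.0 * label + shift, 1.0)])
        y.append(label)
    return X, y

def tune_k(X, y, candidates, rng):
    """Pick k by AUC on a held-out third of the source project (one possible tuning scheme)."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    cut = len(idx) * 2 // 3
    train, valid = idx[:cut], idx[cut:]
    train_X, train_y = [X[i] for i in train], [y[i] for i in train]
    valid_X, valid_y = [X[i] for i in valid], [y[i] for i in valid]
    return max(candidates, key=lambda k: auc(knn_scores(train_X, train_y, valid_X, k), valid_y))

rng = random.Random(0)
source_X, source_y = make_project(rng, 200, shift=0.0)  # labeled source project
target_X, target_y = make_project(rng, 100, shift=0.3)  # target project with shifted metrics

default_k = 1                  # Weka IBk default
candidates = [1, 3, 5, 9, 15]  # hypothetical search grid
best_k = tune_k(source_X, source_y, candidates, rng)

auc_default = auc(knn_scores(source_X, source_y, target_X, default_k), target_y)
auc_tuned = auc(knn_scores(source_X, source_y, target_X, best_k), target_y)
print(f"default k={default_k}: AUC={auc_default:.3f}")
print(f"tuned   k={best_k}: AUC={auc_tuned:.3f}")
```

Tuning uses only source-project labels here, since target labels are not available at prediction time in a cross-project setting. Whether the tuned value actually improves target AUC depends on the data, which is the question the study examines empirically.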