Computer Science Journal of Moldova

The impact of parameter optimization of ensemble learning on defect prediction



Abstract

Machine learning algorithms have configurable parameters that practitioners generally leave at their default settings. Modifying the parameters of a machine learning algorithm is called hyperparameter optimization (HO), and it is performed to find the most suitable parameter setting in classification experiments. Existing studies propose using either the default classification model or an optimal parameter configuration. This work investigates the effect of applying HO to ensemble learning algorithms in terms of defect prediction performance. Further, this paper presents a new ensemble learning algorithm for defect prediction data sets, called novelEnsemble. The method has been tested on 27 data sets and compared with three alternatives. Welch's heteroscedastic F test is used to examine the difference between performance parameters, and Cliff's delta is applied to the results of the compared algorithms to control the magnitude of the difference. According to the results of the experiment: 1) ensemble methods featuring HO perform better than a single predictor; 2) although the error of triTraining decreases linearly, it still produces errors at an unacceptable level; 3) novelEnsemble yields promising results, especially in terms of the area under the curve (AUC) and the Matthews correlation coefficient (MCC); 4) the effect of HO does not stagnate with the scale of the data set; 5) not every ensemble learning approach creates a favorable effect on HO. To demonstrate the importance of the hyperparameter selection process, the experiment is validated with suitable statistical analyses. The study reveals that the success of HO, contrary to expectations, depends not on the type of the classifiers but rather on the design of the ensemble learners.
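
The abstract does not disclose the paper's tuning setup or the internals of novelEnsemble, so the sketch below only illustrates the general idea of HO on an ensemble learner: a grid search over a random forest's parameters, scored with the AUC and MCC metrics named above. The data set, the parameter grid, and the choice of scikit-learn are assumptions for illustration, not the authors' method.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_predict
from sklearn.metrics import roc_auc_score, matthews_corrcoef

# Stand-in for a defect prediction data set (features -> defective yes/no);
# the class imbalance mimics the rarity of defective modules.
X, y = make_classification(n_samples=500, n_features=20, weights=[0.8],
                           random_state=42)

# Hyperparameter optimization: search the ensemble's parameter space
# instead of accepting the library defaults (a hypothetical grid).
grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "max_features": ["sqrt", "log2"],
}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      grid, scoring="roc_auc", cv=5)
search.fit(X, y)

# Evaluate the tuned ensemble with the two metrics highlighted above.
best = search.best_estimator_
proba = cross_val_predict(best, X, y, cv=5, method="predict_proba")[:, 1]
print("best params:", search.best_params_)
print("AUC:", roc_auc_score(y, proba))
print("MCC:", matthews_corrcoef(y, (proba >= 0.5).astype(int)))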
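Likewise, a minimal sketch of the two statistics used in the comparison, Welch's heteroscedastic F test and Cliff's delta, assuming vectors of per-data-set performance scores. The score values below are synthetic placeholders, not the paper's results.

import numpy as np
from scipy.stats import f as f_dist

def welch_anova(*groups):
    """Welch's heteroscedastic F test for k independent groups."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    var = np.array([np.var(g, ddof=1) for g in groups])
    w = n / var                            # precision weights
    mw = np.sum(w * means) / np.sum(w)     # weighted grand mean
    a = np.sum(w * (means - mw) ** 2) / (k - 1)
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    b = 1 + 2 * (k - 2) / (k ** 2 - 1) * tmp
    f_stat = a / b
    df1, df2 = k - 1, (k ** 2 - 1) / (3 * tmp)
    return f_stat, f_dist.sf(f_stat, df1, df2)  # statistic, p-value

def cliffs_delta(x, y):
    """Cliff's delta: effect size in [-1, 1] from all pairwise comparisons."""
    x, y = np.asarray(x), np.asarray(y)
    gt = np.sum(x[:, None] > y[None, :])
    lt = np.sum(x[:, None] < y[None, :])
    return (gt - lt) / (len(x) * len(y))

# Hypothetical AUC scores of two methods over 27 data sets.
rng = np.random.default_rng(0)
a = rng.normal(0.78, 0.05, 27)
b = rng.normal(0.74, 0.08, 27)
print(welch_anova(a, b))   # is the performance difference significant?
print(cliffs_delta(a, b))  # how large is the difference?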
