Journal: Information and Software Technology

Code smell detection using feature selection and stacking ensemble: An empirical investigation

Abstract

Context: Code smell detection is the process of identifying pieces of code that are poorly designed and implemented. Recent research has increasingly turned to machine learning-based approaches for code smell detection. Many classifiers have been explored in the literature, yet no model has been found that effectively detects different code smell types.

Objective: The main objective of this paper is to empirically investigate the capabilities of a stacking heterogeneous ensemble model in code smell detection.

Methods: The Gain feature selection technique was applied to select relevant features for code smell detection. The detection performance of 14 individual classifiers was investigated on two class-level and four method-level code smells. Three stacking ensembles were then built, each using all individual classifiers as base classifiers and one of three meta-classifiers (LR, SVM, or DT).

Results: The GP, MLP, DT, and SVM(Lin) classifiers were among the best-performing classifiers in detecting most of the code smells. In contrast, SVM(Sig), NB(B), NB(M), and SGD were among the least accurate classifiers for most smell types. The stacking ensembles with LR and SVM meta-classifiers achieved consistently high detection performance on both class-level and method-level code smells compared with all individual models.

Conclusion: The detection performance of most individual classifiers varied from one code smell type to another. However, the stacking ensembles with LR and SVM meta-classifiers consistently outperformed all individual classifiers in detecting the different code smell types.
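The abstract names a Gain feature selection step but does not say which gain measure is used. The following is a minimal sketch that assumes plain information gain over discretised features (gain ratio would be another plausible reading). It is not the authors' implementation. The feature names, the toy data, and the helper functions `entropy`, `information_gain`, and `rank_features` are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    columns = list(zip(*rows))
    gains = [(n, information_gain(col, labels)) for n, col in zip(names, columns)]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: three discretised metrics for six methods.
names = ["loc_bucket", "complexity_bucket", "has_javadoc"]
rows = [
    ("high", "high", 1),
    ("high", "high", 0),
    ("high", "low", 1),
    ("low", "low", 0),
    ("low", "low", 1),
    ("low", "high", 0),
]
labels = ["smelly", "smelly", "smelly", "clean", "clean", "clean"]

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data `loc_bucket` separates the two classes perfectly, so it ranks first. A feature selection step of this kind would keep only the top-ranked features before training the base classifiers and the stacking ensemble.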