A Novel Four-Way Approach Designed With Ensemble Feature Selection for Code Smell Detection

Journal: Quality Control, Transactions

Abstract

Purpose: Code smells are residues of technical debt introduced by developers. They hinder the evolution, adaptability, and maintenance of software. At the same time, they are useful indicators of where problems and bugs may lie in the software. Machine learning has been used extensively in research to predict code smells. The current study aims to optimise this prediction using Ensemble Learning and Feature Selection techniques on three open-source Java data sets. Design and Results: The work compares four different approaches to detecting code smells using four performance measures: Accuracy (P1), G-mean1 (P2), G-mean2 (P3), and F-measure (P4). The study found that the values of the performance measures did not degrade; they either remained the same or increased with Feature Selection and Ensemble Learning. Random Forest turns out to be the best classifier, while Correlation-based Feature Selection (BFS) is the best of the Feature Selection techniques. Of all the aggregation combinations studied, the Ensemble Learning aggregators ET5C2 (BFS intersection Relief with classifier Random Forest), ET6C2 (BFS union Relief with classifier Random Forest), and ET5C1 (BFS intersection Relief with Bagging), together with Majority Voting, give the best results. Conclusion: Although the results are good, Ensemble Learning techniques need extensive validation on a variety of data sets before they can be standardised. They also pose challenges concerning diversity and reliability and hence need exhaustive study.
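The aggregation ideas named in the abstract (intersection and union of the feature subsets chosen by two selectors, and majority voting over classifier outputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the metric names (`LOC`, `WMC`, `CBO`, `LCOM`), and the sample predictions below are hypothetical and chosen only for illustration.

```python
from collections import Counter


def intersect_features(*subsets):
    """Keep only the features that every selector chose (at least one subset required)."""
    return sorted(set.intersection(*(set(s) for s in subsets)))


def union_features(*subsets):
    """Keep every feature that at least one selector chose."""
    return sorted(set.union(*(set(s) for s in subsets)))


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` holds one list of labels per classifier, all of equal
    length. On a tie, the label seen first among the classifiers wins,
    so an odd number of binary classifiers avoids ambiguity.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Hypothetical feature subsets, standing in for the output of two selectors.
bfs_selected = ["LOC", "WMC", "CBO"]
relief_selected = ["WMC", "CBO", "LCOM"]

common = intersect_features(bfs_selected, relief_selected)
combined = union_features(bfs_selected, relief_selected)

# Hypothetical smelly (1) / non-smelly (0) predictions from three classifiers.
votes = majority_vote([
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
])

print(common)    # features both selectors agree on
print(combined)  # features chosen by either selector
print(votes)     # one voted label per sample
```

Intersection yields a smaller subset that both selectors agree on, while union trades compactness for coverage; which one suits a given data set is an empirical question, as the abstract's call for further validation suggests.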