Journal: Expert Systems with Applications

Code smell detection and identification in imbalanced environments


Abstract

Context: Code smells are sub-optimal design choices that can lower software maintainability.

Objective: Previous literature did not consider an important characteristic of the smell detection problem, namely data imbalance. When a high number of code smell types is considered, the number of smelly classes is likely to largely exceed the number of non-smelly ones, or vice versa. Moreover, most studies did not address the smell identification problem, which is likely to present an even higher imbalance because the number of smelly classes is much smaller than the number of non-smelly ones. A further gap in the literature is that smell type identification methods are very few compared to detection methods.

Research gap: The main challenges in smell detection and identification in an imbalanced environment are: (1) structuring the smell detector so that it can deal with complex splitting boundaries and small disjuncts, (2) designing a detector quality evaluation function that takes data imbalance into account, and (3) searching efficiently for effective software metric thresholds that characterize the different smells well.

Method: We propose ADIODE, an effective search-based engine that deals with all of the above challenges, for both smell detection and smell identification. ADIODE is an EA (Evolutionary Algorithm) that evolves a population of detectors encoded as ODTs (Oblique Decision Trees), using the F-measure as the fitness function. This allows ADIODE to efficiently approximate globally optimal detectors with effective oblique splitting hyperplanes and metric thresholds. To build the BE, each software class is parsed with a dedicated tool to extract its metric values, and the class is then labeled on the basis of those values by a set of existing advisors; construction is thus a two-step process.

Results: A comparative experimental study on six open-source software systems shows that our approach outperforms four of the most representative and prominent baseline techniques in the literature. In detection, the F-measure of ADIODE ranges between 91.23% and 95.24%, and its AUC lies between 0.9273 and 0.9573. In identification, the F-measure ranges between 86.26% and 94.5%, and its AUC lies between 0.8653 and 0.9531.
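
The two ideas at the core of the method, an oblique split and an F-measure fitness, can be illustrated with a minimal sketch. This is not ADIODE's implementation: the evolutionary loop is omitted, a single split stands in for a full oblique decision tree, and the two metrics (lines of code, number of methods), the weights, the threshold, and the toy data are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ObliqueSplit:
    """A single oblique test: flag a class as smelly when w . x > threshold."""
    weights: Sequence[float]  # one weight per software metric (hypothetical)
    threshold: float

    def predict(self, metrics: Sequence[float]) -> bool:
        # Linear combination of several metrics, not a cut on one metric.
        score = sum(w * m for w, m in zip(self.weights, metrics))
        return score > self.threshold


def f_measure(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Harmonic mean of precision and recall on the smelly (positive) class."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    return sum(1 for p, a in zip(predicted, actual) if p == a) / len(actual)


def fitness(detector: ObliqueSplit,
            samples: Sequence[Sequence[float]],
            labels: Sequence[bool]) -> float:
    """Fitness of one candidate detector: its F-measure on labeled examples."""
    return f_measure([detector.predict(x) for x in samples], labels)


# Hypothetical imbalanced toy set: (lines of code, number of methods), smelly?
DATA: List[Tuple[Tuple[float, float], bool]] = [
    ((900, 40), True), ((700, 55), True),
    ((100, 5), False), ((200, 8), False), ((150, 10), False),
    ((300, 12), False), ((250, 6), False), ((120, 4), False),
    ((400, 15), False), ((600, 10), False),
    ((950, 3), False), ((100, 60), False),
]
SAMPLES = [x for x, _ in DATA]
LABELS = [y for _, y in DATA]

# A degenerate detector that never reports a smell.
never_smelly = [False] * len(LABELS)

# One hand-picked oblique split (an EA would search for such weights).
detector = ObliqueSplit(weights=(1.0, 10.0), threshold=1000.0)

print(f"never-smelly accuracy:  {accuracy(never_smelly, LABELS):.3f}")
print(f"never-smelly F-measure: {f_measure(never_smelly, LABELS):.3f}")
print(f"oblique split F-measure: {fitness(detector, SAMPLES, LABELS):.3f}")
```

On this toy set the detector that never reports a smell is right on 10 of 12 classes yet has an F-measure of 0, which is why accuracy is a poor fitness signal under imbalance. The data is also arranged so that no single-metric cut of the form "metric > t" separates the two groups, while the oblique combination does.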
