Code smell detection and identification in imbalanced environments

Boutaib Sofien; Bechikh Slim; Palomba Fabio; Elarbi Maha; Makhlouf Mohamed; Ben Said Lamjed

Abstract

Context: Code smells are sub-optimal design choices that can lower software maintainability.

Objective: Previous literature did not consider an important characteristic of the smell detection problem, namely data imbalance. When a high number of code smell types is considered, the number of smelly classes is likely to largely exceed the number of non-smelly ones, and vice versa. Moreover, most studies did not address the smell identification problem, which is likely to present an even higher imbalance because the number of smelly classes is much smaller than the number of non-smelly ones.

Research gap: The main challenges in smell detection and identification in an imbalanced environment are: (1) structuring the smell detector so that it can deal with complex splitting boundaries and small disjuncts, (2) designing a detector quality evaluation function that takes data imbalance into account, and (3) searching efficiently for software metric thresholds that characterize the different smells well. Furthermore, the number of smell type identification methods is very small compared to the number of detection methods.

Method: We propose ADIODE, an effective search-based engine that deals with all of the challenges described above, for both the smell detection case and the identification case. ADIODE is an EA (Evolutionary Algorithm) that evolves a population of detectors encoded as ODTs (Oblique Decision Trees), using the F-measure as a fitness function. This allows ADIODE to efficiently approximate globally-optimal detectors with effective oblique splitting hyper-planes and metric thresholds. To build the BE, each software class is parsed with a dedicated tool to extract its metric values, and the class is then labeled on the basis of these values by a set of existing advisors; this can be seen as a two-step construction process.

Results: A comparative experimental study on six open-source software systems shows that our approach outperforms four of the most representative and prominent baseline techniques available in the literature. The detection results show that the F-measure of ADIODE ranges between 91.23% and 95.24%, and its AUC lies between 0.9273 and 0.9573. Similarly, the identification results indicate that the F-measure of ADIODE varies between 86.26% and 94.5%, and its AUC is between 0.8653 and 0.9531.
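The Method paragraph can be illustrated with a minimal sketch. It is not the authors' ADIODE implementation: it assumes a single oblique split instead of a full oblique decision tree, a plain mutation-only evolutionary loop with truncation selection, and synthetic data with two hypothetical normalized metrics. All function names, parameters, and data are illustrative assumptions. It only shows how an oblique hyper-plane (a weighted sum of metrics compared against a threshold) can be evolved with the F-measure as fitness on imbalanced labels.

```python
import random


def f_measure(y_true, y_pred):
    """F-measure (harmonic mean of precision and recall) for the smelly class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def oblique_predict(detector, x):
    """Oblique split: a weighted sum of all metrics is compared against one threshold."""
    weights, threshold = detector
    return 1 if sum(w * v for w, v in zip(weights, x)) > threshold else 0


def fitness(detector, X, y):
    """Fitness of a detector is its F-measure on the labeled examples."""
    return f_measure(y, [oblique_predict(detector, x) for x in X])


def mutate(detector, rng, sigma=0.3):
    """Gaussian perturbation of the hyper-plane weights and the threshold."""
    weights, threshold = detector
    return ([w + rng.gauss(0, sigma) for w in weights], threshold + rng.gauss(0, sigma))


def evolve(X, y, pop_size=20, generations=30, seed=0):
    """Evolve single-split detectors; returns the best one and the best fitness per generation."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [([rng.uniform(-1, 1) for _ in range(n)], rng.uniform(-1, 1))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda d: fitness(d, X, y), reverse=True)
        history.append(fitness(pop[0], X, y))
        parents = pop[: pop_size // 2]  # truncation selection; the best detector always survives
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    best = max(pop, key=lambda d: fitness(d, X, y))
    history.append(fitness(best, X, y))
    return best, history


def make_toy_data(n=200, seed=1):
    """Synthetic, imbalanced data: two hypothetical normalized metrics, few smelly classes."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if 0.6 * a + 0.8 * b > 1.1 else 0 for a, b in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best, history = evolve(X, y)
    print("smelly examples:", sum(y), "of", len(y))
    print("best F-measure:", history[-1])
```

Because the parents are carried over unchanged, the best F-measure in this sketch never decreases from one generation to the next. Accuracy would be a poor fitness here: with roughly one smelly class in ten, a detector that labels everything as non-smelly already scores about 90% accuracy but an F-measure of 0.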

Bibliographic details

Source
Expert systems with applications | 2021, Issue 3 | 114076.1-114076.26 | 26 pages
Authors
Boutaib Sofien; Bechikh Slim; Palomba Fabio; Elarbi Maha; Makhlouf Mohamed; Ben Said Lamjed;
Author affiliations

Univ Tunis SMART Lab ISG Tunis Tunisia;

Univ Tunis SMART Lab ISG Tunis Tunisia;

Univ Salerno Salerno Italy;

Univ Tunis SMART Lab ISG Tunis Tunisia;

Kedge Business Sch Talence France;

Univ Tunis SMART Lab ISG Tunis Tunisia;

Original format: PDF
Language: English
Keywords
Code smells detection; Smell type identification; Imbalanced data classification; Oblique decision tree; Evolutionary algorithm;
