Software Defect Prediction and Localization with Attention-Based Models and Ensemble Learning

Abstract

Software defect prediction (SDP) uses a trained model to predict the defect proneness of code modules in a software system by mining the inherent characteristics of historical defect data. An effective model can optimize the allocation of testing resources and thus improve the quality of software products. Most previous studies represent code snippets with handcrafted features, which struggle to capture the semantic and structural information of the code context that is often crucial for software defect prediction. Moreover, most existing software defect prediction models cannot make predictions at the code line level, which makes it hard to give developers more detailed reference information. To address these issues, we propose a model based on ensemble learning techniques and attention mechanisms that offers developers more comprehensive prediction information by locating suspect lines of code when making method-level defect predictions. The model uses abstract syntax trees (ASTs) as the intermediate representation of code snippets. Because historical defect data is markedly class-imbalanced, an approach based on Self-Organizing Map (SOM) clustering is employed to handle noisy data. Experimental results show that, on average, the proposed model improves the F-measure by 17.7% and AUC by 37.8% compared with four other machine learning algorithms.
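To make the idea of an AST as an intermediate representation concrete, the following is a minimal sketch, not the authors' pipeline. It assumes Python source and the standard-library `ast` module purely for illustration; the abstract does not say which language or parser the paper uses. The helper names `ast_node_types` and `node_types_by_line` are hypothetical. The second helper groups node types by source line, which hints at how a method-level representation can still be tied back to individual lines.

```python
import ast
from typing import Dict, List


def ast_node_types(source: str) -> List[str]:
    """Return the type name of every node in the AST of `source`."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def node_types_by_line(source: str) -> Dict[int, List[str]]:
    """Group AST node type names by the source line they start on.

    Nodes without position information (e.g. operators) are skipped.
    """
    tree = ast.parse(source)
    by_line: Dict[int, List[str]] = {}
    for node in ast.walk(tree):
        lineno = getattr(node, "lineno", None)
        if lineno is not None:
            by_line.setdefault(lineno, []).append(type(node).__name__)
    return by_line


# Hypothetical two-line method used only as sample input.
snippet = "def add(a, b):\n    return a + b\n"
node_types = ast_node_types(snippet)
per_line = node_types_by_line(snippet)
```

A sequence such as `node_types` could be fed to a learned model in place of handcrafted features, while `per_line` shows one possible way to map model attention back to lines; both are assumptions for illustration only.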

Bibliographic record

Source
Asia-Pacific Software Engineering Conference | 2020 | pp. 81-90 | 10 pages
Authors
Tianhang Zhang; Qingfeng Du; Jincheng Xu; Jiechu Li; Xiaojun Li
Original format: PDF
Keywords
Self-organizing feature maps; Predictive models; Syntactics; Software systems; Software; Testing; Software engineering

Similar literature

1. Omni-Ensemble Learning (OEL): Utilizing Over-Bagging, Static and Dynamic Ensemble Selection Approaches for Software Defect Prediction [J]. Reza Mousavi, Mahdi Eftekhari, Farhad Rahdari. International Journal of Artificial Intelligence Tools: Architectures, Languages, Algorithms. 2018, No. 6
2. Software defect prediction using stacked denoising autoencoders and two-stage ensemble learning [J]. Haonan Tong, Bin Liu, Shihai Wang. Information and software technology. 2018, No. APRa
3. Multiple kernel ensemble learning for software defect prediction [J]. Tiejian Wang, Zhiwu Zhang, Xiaoyuan Jing, Automated software engineering. 2016, No. 4
4. Handling Imbalanced Data using Ensemble Learning in Software Defect Prediction [C]. Ruchika Malhotra, Juhi Jain. International Conference on Cloud Computing, Data Science Engineering. 2020
5. The Effects of Parameter Tuning on Machine Learning Performance in a Software Defect Prediction Context [D]. Province, Benjamin N. 2015
6. DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data [O]. Olivier B. Poirion, Zheng Jing, Kumardeep Chaudhary, 2021
7. Software Defect Prediction Using Variant based Ensemble Learning and Feature Selection Techniques [O]. Umair Ali, Shabib Aftab, Ahmed Iqbal, 2020
