Label propagation based semi-supervised learning for software defect prediction

Zhi-Wu Zhang; Xiao-Yuan Jing; Tie-Jian Wang


Abstract

Software defect prediction automatically identifies defect-prone software modules so that testing effort can be allocated efficiently. When few historical defect labels are available for the modules, predicting the defect-prone modules becomes a challenging problem. Data for static software defect prediction have two notable characteristics: similarity among software modules, in that a module can be approximated by a sparse representation of other modules, and class imbalance, in that defect-free modules far outnumber defective ones. In this paper, we propose using a graph-based semi-supervised learning technique to predict software defects. First, we apply a Laplacian score sampling strategy to the labeled defect-free modules to construct a class-balanced labeled training dataset. Next, we use a nonnegative sparse algorithm to compute the nonnegative sparse weights of a relationship graph, which serve as clustering indicators. Finally, on the nonnegative sparse graph, we use a label propagation algorithm to iteratively predict the labels of the unlabeled software modules. The resulting approach, nonnegative sparse graph based label propagation (NSGLP), performs software defect classification and prediction using not only a small amount of labeled data but also abundant unlabeled data to improve generalization capability. We vary the proportion of labeled software modules from 10% to 30% across all the datasets from the widely used NASA projects. Experimental results show that NSGLP outperforms several representative state-of-the-art semi-supervised software defect prediction methods, and that it can fully exploit the characteristics of static code metrics and improve the generalization capability of the software defect prediction model.
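The graph construction and propagation steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the regularization weight `lam`, the iteration counts, the projected-gradient solver, the symmetrization of the graph, and the clamped propagation rule are all assumptions chosen for illustration, and the Laplacian score sampling step is omitted because the toy data are already balanced.

```python
import numpy as np

def nonnegative_sparse_weights(X, lam=0.01, n_iter=200):
    """Row i holds nonnegative coefficients that reconstruct X[i] from the other rows.

    Assumed solver: projected gradient (ISTA) on
    0.5 * ||x_i - A w||^2 + lam * sum(w), subject to w >= 0.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.delete(np.arange(n), i)      # every module except i
        A = X[idx].T                          # columns are the other modules
        step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)
        w = np.zeros(n - 1)
        for _ in range(n_iter):
            grad = A.T @ (A @ w - X[i])
            w = np.maximum(0.0, w - step * (grad + lam))
        W[i, idx] = w
    return W

def propagate_labels(W, y, n_classes=2, n_iter=100):
    """Iterative label propagation; y uses -1 for unlabeled modules."""
    S = W + W.T                               # assumed: symmetrize the graph
    P = S / np.maximum(S.sum(axis=1, keepdims=True), 1e-12)
    labeled = y >= 0
    Y0 = np.zeros((len(y), n_classes))
    Y0[labeled, y[labeled]] = 1.0
    F = Y0.copy()
    for _ in range(n_iter):
        F = P @ F                             # spread label mass along edges
        F[labeled] = Y0[labeled]              # clamp the known labels
    return F.argmax(axis=1)

# Hypothetical toy metrics: 10 defect-free and 10 defective modules.
rng = np.random.default_rng(0)
clean = np.array([5.0, 1.0, 0.0, 0.0]) + rng.uniform(0, 0.1, (10, 4))
buggy = np.array([0.0, 0.0, 1.0, 5.0]) + rng.uniform(0, 0.1, (10, 4))
X = np.vstack([clean, buggy])
truth = np.array([0] * 10 + [1] * 10)

y = np.full(20, -1)
y[[0, 1, 10, 11]] = truth[[0, 1, 10, 11]]     # 20% of modules are labeled

W = nonnegative_sparse_weights(X)
pred = propagate_labels(W, y)
```

In this sketch each module is reconstructed only from other modules with nonnegative weights, so modules with similar metric profiles end up strongly connected, and the labels of the few labeled modules spread to their neighbors on the graph.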


Bibliographic details

Source
Automated Software Engineering, 2017, Issue 1, pp. 47-69 (23 pages)
Authors
Zhi-Wu Zhang; Xiao-Yuan Jing; Tie-Jian Wang
Author affiliations

School of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, People's Republic of China;

School of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, People's Republic of China, and State Key Laboratory of Software Engineering, School of Computer, Wuhan University, Wuhan 430072, People's Republic of China;

State Key Laboratory of Software Engineering, School of Computer, Wuhan University, Wuhan 430072, People's Republic of China;

Original format: PDF
Language: English
Keywords
Software defect prediction; Semi-supervised learning; Nonnegative sparse graph; Label propagation; Nonnegative sparse graph based label propagation (NSGLP);


Similar literature

1. Sample-based software defect prediction with active and semi-supervised learning [J]. Ming Li, Hongyu Zhang, Rongxin Wu. Automated Software Engineering, 2012, Issue 2.
2. Robust Label Prediction via Label Propagation and Geodesic k-Nearest Neighbor in Online Semi-Supervised Learning [J]. Yuichiro WADA, Siqiang SU, Wataru KUMAGAI. IEICE Transactions on Information and Systems, 2019, Issue 8.
3. Prediction of Alzheimer's Diagnosis Using Semi-supervised Distance Metric Learning with Label Propagation [J]. Reiji Teramoto. Computational Biology and Chemistry, 2008, Issue 6.
4. Software defect prediction using semi-supervised learning with dimension reduction [C]. Lu Huihua, Cukic Bojan, Culp Mark. IEEE/ACM International Conference on Automated Software Engineering, 2012.
5. Sluicebox: Semi-supervised learning for label prediction with concept evolution and tracking in non-stationary data streams [D]. Parker, Brandon Shane. 2014.
6. Integrating Semi-Supervised Label Propagation and Random Forests for Multi-Atlas Based Hippocampus Segmentation [O]. Qiang Zheng, Yong Fan.
7. Robust Label Prediction via Label Propagation and Geodesic k-Nearest Neighbor in Online Semi-Supervised Learning [O]. Yuichiro WADA, Siqiang SU, Wataru KUMAGAI. 2019.
