Journal: Software, IET

Low-rank representation for semi-supervised software defect prediction

Abstract

Software defect prediction based on machine learning is an active research topic in software engineering. Historical defect data in software repositories may contain noise, because automatic defect collection relies on modification logs and defect reports. When historical defect labels for modules are limited, predicting defect-prone modules becomes a challenging problem. In this study, the authors propose a graph-based semi-supervised defect prediction approach to address both insufficient labelled data and noisy data. Graph-based semi-supervised learning methods use labelled and unlabelled data simultaneously, treating both as nodes of a graph during the training phase, and so mitigate the shortage of labelled samples. To improve stability on noisy defect data, low-rank representation (LRR), a powerful clustering method, is combined with neighbourhood distance to construct the relationship graph of the samples. The resulting approach is named low-rank representation-based semi-supervised software defect prediction (LRRSSDP). Widely used datasets from NASA projects, together with noisy datasets, serve as test data for evaluating performance. Experimental results show that (i) LRRSSDP outperforms several representative state-of-the-art semi-supervised defect prediction methods; and (ii) LRRSSDP remains robust in noisy environments.
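The abstract does not give the LRRSSDP algorithm itself, so the following is only a rough sketch of the general idea of building a sample graph from a low-rank representation and propagating labels over it. It rests on several assumptions that are not taken from the paper: the graph uses the closed-form solution of the noise-free LRR problem (Z = V Vᵀ from the skinny SVD of the data), the neighbourhood-distance term is omitted, and labels are spread with a standard normalised-graph propagation step. The names `low_rank_affinity`, `propagate_labels`, `rank` and `alpha` are hypothetical.

```python
import numpy as np


def low_rank_affinity(X, rank):
    """Build a symmetric affinity matrix from a low-rank representation.

    X has shape (n_samples, n_features). Assumption: we use the closed-form
    minimiser of the noise-free LRR problem, not the paper's exact graph.
    """
    D = X.T  # columns are samples, as in the usual formulation D = D Z
    # Skinny SVD; the rows of Vt span the row space of D.
    _, _, Vt = np.linalg.svd(D, full_matrices=False)
    V = Vt[:rank].T          # (n_samples, rank)
    Z = V @ V.T              # equals the LRR solution when rank == rank(D)
    W = np.abs(Z) + np.abs(Z.T)
    np.fill_diagonal(W, 0.0)  # no self-loops in the graph
    return W


def propagate_labels(W, y, alpha=0.9):
    """Spread labels over the graph W.

    y holds 0 (clean) or 1 (defective) for labelled modules and -1 for
    unlabelled ones. Returns a predicted 0/1 label for every module.
    """
    n = len(y)
    classes = np.array([0, 1])
    Y = np.zeros((n, len(classes)))
    for k, c in enumerate(classes):
        Y[y == c, k] = 1.0
    deg = W.sum(axis=1)
    deg[deg == 0] = 1.0       # guard against isolated nodes
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^-1/2 W D^-1/2
    # Closed-form fixed point of F = alpha * S @ F + Y.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return classes[F.argmax(axis=1)]
```

In this sketch, labelled and unlabelled modules are all nodes of one graph, which is the property the abstract attributes to graph-based semi-supervised methods. Choosing `rank` below the true rank of the data would act as a heuristic denoising step; that choice is an assumption here, not something the abstract specifies.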