Journal: IAENG International Journal of Computer Science

Two Stage Model to Detect and Rank Software Defects on Imbalanced and Scarcity Data Sets


Abstract

In software quality assurance, it is crucial to prevent defective software from being delivered to customers, since doing so reduces maintenance cost and increases software quality and reliability. Software defect prediction is recognized as an important process for automatically detecting the possibility of an error in the software. After defects are detected, their severity levels must be identified to avoid any effects that may obstruct the whole system. Many attempts have been made to capture errors with traditional supervised learning techniques. However, these approaches often face class imbalance and data scarcity, which degrade prediction performance. In this paper, we present a Two-Stage Model to detect and rank defects in software. The model focuses on two tasks. First, we capture defects by applying an unbiased SVM called "R-SVM," which reduces the bias toward the majority class by using threshold adjustment. Second, the detected modules are ranked according to their severity levels by our algorithm called "OS-YATSI," which combines semi-supervised learning with an oversampling strategy to tackle the imbalance issue. The experiment was conducted on 15 Java programs. The results showed that the proposed model outperformed all of the traditional approaches. In the defect prediction model, R-SVM significantly outperformed the others on 6 programs in terms of F1. In the defect ranking model, OS-YATSI significantly outperformed all baseline classifiers on all programs, with an average improvement of 23.75% in terms of macro F1.
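
The abstract does not say how R-SVM chooses its threshold, so the following is only a minimal sketch of the general idea of threshold adjustment, not the authors' algorithm. It assumes that a trained SVM has already produced a decision score per module, and that the threshold is moved away from the default of 0 to whichever candidate gives the best F1 for the minority (defective) class. The scores, labels, and function names are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 of the positive class (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def predict(scores: List[float], threshold: float) -> List[int]:
    """Label a module defective when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def best_threshold(scores: List[float], y_true: List[int]) -> Tuple[float, float]:
    """Search the observed scores for the threshold with the highest F1.

    Starts from the default SVM threshold of 0 and only moves away
    from it when a candidate is strictly better.
    """
    best_t = 0.0
    best_f1 = f1_score(y_true, predict(scores, best_t))
    for t in sorted(set(scores)):
        f1 = f1_score(y_true, predict(scores, t))
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical decision scores for 10 modules: 8 clean, 2 defective.
# One defective module scores below 0, so the default threshold misses it.
scores = [-1.8, -1.5, -1.2, -1.1, -0.9, -0.7, -0.6, -0.4, -0.5, 0.3]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

default_f1 = f1_score(labels, predict(scores, 0.0))
best_t, best_f1 = best_threshold(scores, labels)
print(f"default threshold 0.0: F1 = {default_f1:.3f}")
print(f"adjusted threshold {best_t}: F1 = {best_f1:.3f}")
```

On this toy data, lowering the threshold recovers the missed defective module at the cost of one false positive. In practice the threshold would be tuned on held-out validation data rather than on the data used for evaluation.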
