Journal: International Journal of Performability Engineering

A Two-Stage Feature Weighting Method for Naive Bayes and Its Application in Software Defect Prediction


Abstract

Software defect prediction (SDP) models help software practitioners identify defect-prone modules, so that limited testing resources can be focused on those modules to minimize software defects. Among the various SDP models, Naive Bayes (NB) has been widely used because of its simplicity, effectiveness, and robustness. The NB classifier is an effective classification approach, especially for data sets with discrete attributes. NB assumes that the attributes are independent and therefore treats them as equally important. In practice, however, the attributes of software defect data sets are usually continuous or numeric, and because they are designed for different purposes, they contribute differently to prediction. This paper therefore proposes a new NB method called TSWNB, which consists of two stages: feature (i.e., attribute) discretization and feature weighting. In the discretization stage, we compare two methods, equal-width discretization and equal-frequency discretization, and identify the more appropriate one. In the weighting stage, we use feature weighting to relax the equal-importance assumption by incorporating the obtained feature weights into the NB formula and its likelihood estimates. To evaluate the proposed method, we carry out experiments on five NASA MDP software defect data sets provided by the PROMISE repository. Three well-known classification algorithms and two feature weighting techniques are included for comparison. The experimental results demonstrate the effectiveness and practicability of the two-stage feature weighting method TSWNB.
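The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the TSWNB formulas, so the code below assumes a common form of feature-weighted NB, in which each feature's log-likelihood is scaled by its weight: score(c) = log P(c) + sum_j w_j * log P(x_j | c), with Laplace smoothing. The function names, the toy data, and the hand-picked weights are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def equal_width_edges(values, k):
    """k-1 cut points splitting [min, max] into k bins of equal width."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]

def equal_frequency_edges(values, k):
    """k-1 cut points so that each bin holds roughly the same number of values."""
    s = sorted(values)
    n = len(s)
    return [s[(n * i) // k] for i in range(1, k)]

def to_bin(x, edges):
    """Bin index of x: the number of cut points less than or equal to x."""
    return sum(1 for e in edges if x >= e)

def fit(rows, labels, n_bins):
    """Count class frequencies and (feature, bin, class) frequencies."""
    class_counts = Counter(labels)
    cond_counts = Counter()
    for row, c in zip(rows, labels):
        for j, b in enumerate(row):
            cond_counts[(j, b, c)] += 1
    return class_counts, cond_counts, n_bins

def predict(model, row, weights):
    """Return the class with the highest weighted log-posterior score."""
    class_counts, cond_counts, n_bins = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, nc in class_counts.items():
        # Laplace-smoothed log prior.
        score = math.log((nc + 1) / (total + len(class_counts)))
        for j, b in enumerate(row):
            # Laplace-smoothed likelihood, scaled by the feature weight (assumed form).
            p = (cond_counts[(j, b, c)] + 1) / (nc + n_bins)
            score += weights[j] * math.log(p)
        if score > best_score:
            best, best_score = c, score
    return best

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
raw = [(1, 5), (2, 1), (3, 5), (4, 1), (50, 5), (60, 1), (70, 5), (80, 1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
k = 2
edges = [equal_frequency_edges([r[j] for r in raw], k) for j in range(2)]
rows = [tuple(to_bin(r[j], edges[j]) for j in range(2)) for r in raw]
model = fit(rows, labels, k)
weights = [1.0, 0.2]  # placeholder weights, not learned as in the paper
new_row = tuple(to_bin(v, edges[j]) for j, v in enumerate((65, 1)))
prediction = predict(model, new_row, weights)
```

Setting every weight to 1.0 recovers standard NB on the discretized data, which makes the effect of the weighting stage easy to isolate. Swapping `equal_frequency_edges` for `equal_width_edges` gives the other discretization option compared in the abstract.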
