On the ability of complexity metrics to predict fault-prone classes in object-oriented systems

Abstract

Many studies use logistic regression models to investigate the ability of complexity metrics to predict fault-prone classes. However, previous studies often use performance indicators such as the odds ratio inappropriately. In particular, a recent study by Olague et al. uses the odds ratio associated with a one-unit increase in a metric to compare the relative magnitude of the associations between individual metrics and fault-proneness. In addition, the percentages of concordant, discordant, and tied pairs are used to evaluate the predictive effectiveness of a univariate logistic regression model. Their results suggest that lesser-known complexity metrics such as standard deviation method complexity (SDMC) and average method complexity (AMC) are better predictors than the two commonly used metrics: lines of code (LOC) and weighted method McCabe complexity (WMC). In this paper, however, we show that (1) the odds ratio associated with a one-standard-deviation increase, rather than a one-unit increase, in a metric should be used to compare the relative magnitudes of the effects of individual metrics on fault-proneness, since otherwise misleading results may be obtained; and that (2) the supposed connection between the percentages of concordant, discordant, and tied pairs and the predictive effectiveness of a univariate logistic regression model is false, because these percentages do not depend on the model. Furthermore, we use data collected from three versions of Eclipse to re-examine the ability of complexity metrics to predict fault-proneness. Our experimental results reveal that: (1) many metrics exhibit moderate or almost moderate ability to discriminate between fault-prone and not fault-prone classes; (2) LOC and WMC are indeed better fault-proneness predictors than SDMC and AMC; and (3) other complexity metrics add only limited explanatory power beyond LOC.
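The two methodological points in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the metric values, the fault labels, the regression coefficients, and the helper names are hypothetical and are not taken from the paper or from the Eclipse data. The sketch shows that a per-unit odds ratio depends on the scale of the metric, whereas a per-standard-deviation odds ratio does not, and that the concordant, discordant, and tied percentages of a univariate logistic model with a positive slope are fixed by the ordering of the metric values alone.

```python
import math
import statistics

# Hypothetical data for eight classes (NOT from the paper).
loc = [10, 40, 50, 120, 300, 500, 800, 1500]      # lines of code
amc = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 6.0]    # average method complexity
faulty = [0, 0, 0, 1, 0, 1, 1, 1]                 # 1 = fault-prone

# Hypothetical univariate logistic regression slopes (assumed, not fitted).
beta_loc = 0.002
beta_amc = 0.3


def odds_ratio_per_unit(beta):
    """Odds ratio for a one-unit increase in the metric."""
    return math.exp(beta)


def odds_ratio_per_sd(beta, values):
    """Odds ratio for a one-standard-deviation increase in the metric."""
    return math.exp(beta * statistics.pstdev(values))


def logistic(beta0, beta1, x):
    """Predicted probability of a univariate logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def pair_percentages(scores, labels):
    """Percent concordant, discordant, and tied (faulty, non-faulty) pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    concordant = discordant = tied = 0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1
            elif p < n:
                discordant += 1
            else:
                tied += 1
    total = len(pos) * len(neg)
    return (100 * concordant / total,
            100 * discordant / total,
            100 * tied / total)


# Point (1): the per-unit ranking favours AMC only because AMC has a small
# scale; the per-SD ranking puts both metrics on a comparable footing.
print("per unit:", odds_ratio_per_unit(beta_loc), odds_ratio_per_unit(beta_amc))
print("per SD:  ", odds_ratio_per_sd(beta_loc, loc), odds_ratio_per_sd(beta_amc, amc))

# Point (2): two different models with positive slope, and the raw metric
# itself, all produce the same pair percentages.
model_a = [logistic(-2.0, 0.002, x) for x in loc]
model_b = [logistic(-1.0, 0.005, x) for x in loc]
print(pair_percentages(loc, faulty))
print(pair_percentages(model_a, faulty))
print(pair_percentages(model_b, faulty))
```

With these assumed numbers, AMC looks stronger per unit while LOC looks stronger per standard deviation, which is the kind of reversal the abstract warns about. The pair percentages are identical across the raw metric and both models because the logistic function is strictly increasing in the metric when the slope is positive, so it cannot change which class in a pair scores higher.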

Bibliographic details

  • Authors

    Zhou Y; Xu B; Leung H

  • Year: 2010
  • Original format: PDF
  • Language: English
