Conference: International IEEE conference on intelligent systems: Methodology, models and applications in emergent technologies

Empirical Evaluation of Selected Algorithms for Complexity-Based Classification of Software Modules and a New Model

Abstract

Software plays a major role in many organizations, and organizational success depends in part on the quality of the software used. In recent years, many researchers have recognized that statistical classification techniques are well suited to developing software quality prediction models. Various statistical software quality models that use complexity metrics as early indicators of software quality have been proposed in the past. At a high level, the problem of software categorization is to classify software modules as fault-prone or non-fault-prone. A learner is given a set of training modules and the corresponding class labels (i.e., fault-prone or non-fault-prone) and outputs a classifier. The classifier then takes an unlabeled (i.e., hitherto unseen) module and assigns it to a class. The focus of this paper is to study selected classification techniques that are widely used for software categorization. Practitioners are faced with a body of approaches and literature that gives conflicting advice about the usefulness of these classification approaches. The techniques evaluated in this paper include principal component analysis, linear discriminant analysis, multiple linear regression, logistic regression, support vector machines, and finite mixture models. Moreover, we propose a Bayesian approach based on finite Dirichlet mixture models. We experimentally evaluate these approaches using a real data set. Our experimental results show that different algorithms lead to statistically significantly different results.
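The learner/classifier setup described in the abstract can be illustrated with a minimal sketch. It uses logistic regression, one of the techniques the abstract lists, fitted by plain gradient descent. Everything concrete here is an assumption made for illustration: the data is synthetic, the two "complexity metric" features are hypothetical, and the function names are invented. It does not reproduce the paper's data set, its experimental protocol, or the proposed Dirichlet mixture approach.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Learner: fit weights by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one training module.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def classify(model, x):
    """Classifier: return 1 (fault-prone) or 0 (non-fault-prone)."""
    w, b = model
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(score) >= 0.5 else 0


def make_synthetic_modules(n, seed=0):
    """Hypothetical data: two standardized complexity metrics per module."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate classes so both are equally represented
        center = 1.5 if label == 1 else -1.5
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


# Training modules with labels go to the learner, which outputs a classifier.
X_train, y_train = make_synthetic_modules(200, seed=0)
model = train_logistic(X_train, y_train)

# The classifier is then applied to hitherto unseen modules.
X_test, y_test = make_synthetic_modules(100, seed=1)
predictions = [classify(model, x) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Any of the other listed techniques could be substituted for `train_logistic` as long as it returns a model that `classify` can apply to an unseen module; comparing them would mean repeating the held-out evaluation for each.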