Journal: International Journal of Software Science and Computational Intelligence

Software Defect Prediction Based on GUHA Data Mining Procedure and Multi-Objective Pareto Efficient Rule Selection



Abstract

Software defect prediction, when effective, lets developers distribute their testing effort efficiently and focus on defect-prone modules. Testing every module is very resource consuming when the defects lie in only a fraction of the modules. Information about the fault-proneness of classes and methods can be used to develop new strategies that help reduce overall development cost and increase customer satisfaction. Several machine learning strategies have been used in the recent past to identify defective modules. These models are built using publicly available historical software defect data sets. Most of the proposed techniques cannot deal efficiently with the class imbalance problem. It is therefore necessary to develop a prediction model that consists of small, simple and comprehensible rules. Considering these facts, the authors propose a novel defect prediction approach named GUHA based Classification Association Rule Mining algorithm (G-CARM), where "GUHA" stands for General Unary Hypothesis Automaton. The G-CARM approach is primarily based on Classification Association Rule Mining and deploys a two-stage process: attribute discretization, followed by rule generation using GUHA. GUHA is one of the oldest, yet still very powerful, methods of pattern mining. The basic idea of the GUHA procedure is to mine the interesting attribute patterns that indicate defect proneness. The new method has been compared against five other models reported in recent literature, namely Naive Bayes, Support Vector Machine, RIPPER, J48 and the Nearest Neighbour classifier, using several measures, including AUC and probability of detection. The experimental results indicate that the prediction performance of the G-CARM approach is better than that of the other prediction approaches. The authors' approach achieved 76% mean recall and 83% mean precision for defective modules, and 93% mean recall and 83% mean precision for non-defective modules, on the CM1, KC1, KC2 and Eclipse data sets.
Furthermore, the defect rule generation process often produces a large number of rules, and using all of them as a defect predictor requires considerable effort. A rule subset selection process is therefore also proposed to select the best set of rules according to the requirements. Evaluation criteria for defect prediction, such as sensitivity, specificity and precision, often compete against each other. It is therefore important to use multi-objective optimization algorithms for selecting prediction rules. In this paper the authors report prediction rules that are Pareto efficient, in the sense that no further improvement in the rule set is possible without sacrificing some performance criterion. The Non-Dominated Sorting Genetic Algorithm has been used to find the Pareto front and the defect prediction rules.
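The notion of Pareto efficiency used for rule selection can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a brute-force non-dominated filter instead of the Non-Dominated Sorting Genetic Algorithm, and the `Rule` type, the rule names and all scores are hypothetical values invented for illustration.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    """A candidate defect-prediction rule with three competing scores (all hypothetical)."""
    name: str
    sensitivity: float
    specificity: float
    precision: float


def dominates(a: Rule, b: Rule) -> bool:
    """True if rule a is at least as good as b on every score and strictly better on one."""
    pairs = list(zip(a[1:], b[1:]))  # skip the name field, compare scores only
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def pareto_front(rules: List[Rule]) -> List[Rule]:
    """Keep only the rules that no other rule dominates (brute force, O(n^2))."""
    return [r for r in rules if not any(dominates(o, r) for o in rules)]


# Made-up scores, chosen only to show the filtering behaviour.
rules = [
    Rule("r1", 0.80, 0.70, 0.60),
    Rule("r2", 0.60, 0.90, 0.75),
    Rule("r3", 0.55, 0.85, 0.70),  # worse than r2 on every score
    Rule("r4", 0.80, 0.70, 0.55),  # ties r1 on two scores, worse on precision
]

front = pareto_front(rules)
print([r.name for r in front])  # ['r1', 'r2']
```

Here `r1` and `r2` both survive because neither can be improved on one criterion without losing on another, which is the trade-off the abstract describes. A genetic algorithm becomes useful when the candidates are whole rule subsets rather than a short list of single rules, since exhaustive comparison is then no longer feasible.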
