Journal of Theoretical and Applied Information Technology

IMPLICATIONS OF DISCRETIZATION TOWARDS IMPROVING CLASSIFICATION ACCURACY FOR SOFTWARE DEFECT DATA

Abstract

With the advent of new software architectures, paradigms, and technologies, software design and development face stringent requirements for quality and reliability. This motivates predicting software defects at the early stages of development. Machine learning algorithms now play a crucial role in classifying and predicting possible bugs during the system design phase. This work proposes a discretization method based on threshold values of object-oriented metrics in order to improve classification accuracy on a given data set. For the experiments, the jEdit, Lucene, Tomcat, Velocity, Xalan, and Xerces software systems from NASA repositories were considered, and classification accuracies were compared with those of existing approaches using the open-source WEKA tool. The object-oriented CK metrics suite was used because of its wide applicability in the software industry for software quality prediction. The experiments show that the Naive Bayes and Voted Perceptron classifiers perform well and achieve the highest accuracy on the discretized data set values. The performance of these classifiers is analyzed using several measures, including ROC, RMSE, precision, and recall. The results show significant improvements in classification accuracy when the classifiers are used with discretized features of the individual software systems.
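
The idea of discretizing object-oriented metrics by threshold values can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not list the thresholds or the metrics used, so the metric names (`wmc`, `cbo`, `dit`) and every cut point below are hypothetical placeholders. Each metric value is replaced by the index of the bin it falls into.

```python
from bisect import bisect_right

# Hypothetical cut points for three CK metrics.
# These values are illustrative assumptions, not the thresholds from the paper.
THRESHOLDS = {
    "wmc": [20],     # weighted methods per class: one cut, two bins
    "cbo": [9],      # coupling between objects: one cut, two bins
    "dit": [2, 5],   # depth of inheritance tree: two cuts, three bins
}


def discretize(value, cuts):
    """Return the bin index for value given sorted cut points.

    Bin 0 holds values below the first cut; a value equal to a cut
    point falls into the upper bin.
    """
    return bisect_right(cuts, value)


def discretize_row(row, thresholds=THRESHOLDS):
    """Map each metric in one class-level record to its bin index."""
    return {name: discretize(row[name], cuts) for name, cuts in thresholds.items()}


if __name__ == "__main__":
    example = {"wmc": 35, "cbo": 4, "dit": 2}
    print(discretize_row(example))
```

The discretized records could then be passed to any classifier that accepts nominal features. Using fixed, metric-specific cut points, rather than equal-width or entropy-based binning, is what distinguishes a threshold-based scheme in this sketch.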