2014 International Conference on Data Mining and Intelligent Computing

Software maintainability prediction by data mining of software code metrics



Abstract

Software maintainability is a key quality attribute that determines the success of a software product, so predicting it accurately can help improve overall software quality. This paper applies data mining to several new predictor metrics, in addition to traditionally used software metrics, to predict the maintainability of software systems. The prediction models are built from static code metric datasets of four open source software (OSS) systems: Lucene, JHotdraw, JEdit, and JTreeview. Lucene contains 385 classes and 135241 lines of code (LOC), JHotdraw contains 159 classes and 21802 LOC, JEdit contains 275 classes and 104053 LOC, and JTreeview contains 60 classes and 11988 LOC. The metrics were collected with two metrics extraction tools: the Chidamber and Kemerer Java Metrics (CKJM) tool and IntelliJ IDEA. Naïve Bayes, Bayes Network, Logistic, MultiLayerPerceptron, and Random Forest classifiers are used to identify software modules that are difficult to maintain. Random Forest models are found to be the most useful for this task, as they achieve higher recall, precision, and area under the ROC curve (AUC) than the other classifiers.
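
The abstract compares classifiers by recall, precision, and AUC. As a rough illustration of what those three measures compute, the following Python sketch evaluates them on a tiny hand-made example. The labels, scores, 0.5 decision threshold, and function names are hypothetical and chosen only for illustration; they are not data, settings, or code from the paper.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the positive class (1 = hard to maintain)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged classes
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed classes
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = class labeled hard to maintain, 0 = otherwise.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical classifier scores (predicted probability of "hard to maintain").
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 for turning scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.4f} recall={recall:.4f} auc={auc:.4f}")
# prints: precision=0.7500 recall=0.7500 auc=0.9375
```

Precision and recall depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the three are commonly reported together when comparing classifiers.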
