Journal: Software

Choosing software metrics for defect prediction: an investigation on feature selection techniques


Abstract

The selection of software metrics for building software quality prediction models is a search-based software engineering problem. An exhaustive search for such metrics is usually not feasible due to limited project resources, especially if the number of available metrics is large. Defect prediction models help project managers make better use of valuable project resources for software quality improvement. A fault-proneness prediction model is only as effective and useful as the quality of the software measurement data allows. This study focuses on the problem of attribute selection in the context of software quality estimation. A comparative investigation is presented to evaluate our proposed hybrid attribute selection approach, in which feature ranking is first used to reduce the search space, followed by feature subset selection. Seven feature ranking techniques and four feature subset selection approaches are evaluated. The models are trained using five commonly used classification algorithms. The case study is based on software metrics and defect data collected from multiple releases of a large real-world software system. The results demonstrate that while some feature ranking techniques performed similarly, the automatic hybrid search algorithm performed best among the feature subset selection methods. Moreover, the performance of the defect prediction models either improved or remained unchanged when over 85% of the software metrics were eliminated.
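
The two-stage idea in the abstract (rank first to shrink the search space, then search for a subset) can be illustrated with a minimal sketch. The abstract does not name the seven rankers, the four subset search methods, or the five classifiers, so everything concrete here is an assumption chosen for brevity: absolute Pearson correlation as the ranker, greedy forward search as the subset selector, a nearest-centroid classifier scored by leave-one-out accuracy, and synthetic data from the hypothetical helper `make_data`. It is not the authors' implementation.

```python
import random
from statistics import mean, pstdev


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between one metric and the 0/1 defect label."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return abs(cov / (sx * sy))


def rank_features(X, y, keep):
    """Stage 1 (assumed ranker): score each metric alone, keep the top `keep`."""
    n_features = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:keep]


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier using `subset`."""
    correct = 0
    for i in range(len(X)):
        dist = {}
        for label in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroid = [mean([r[j] for r in rows]) for j in subset]
            dist[label] = sum((X[i][j] - c) ** 2 for j, c in zip(subset, centroid))
        predicted = min(dist, key=dist.get)
        correct += predicted == y[i]
    return correct / len(X)


def forward_select(X, y, candidates):
    """Stage 2 (assumed search): greedily add the candidate that most improves accuracy."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        choice = None
        for j in candidates:
            if j in selected:
                continue
            score = loo_accuracy(X, y, selected + [j])
            if score > best:
                best, choice, improved = score, j, True
        if improved:
            selected.append(choice)
    return selected, best


def make_data(n_modules=60, n_metrics=20, seed=0):
    """Hypothetical synthetic data: only metrics 0 and 1 carry defect signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_modules):
        label = i % 2  # 1 = fault-prone module, 0 = not fault-prone
        row = [rng.gauss(0, 1) for _ in range(n_metrics)]
        row[0] += 3 * label
        row[1] += 3 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    candidates = rank_features(X, y, keep=3)  # 20 metrics reduced to 3 (85% removed)
    selected, score = forward_select(X, y, candidates)
    print("ranked candidates:", candidates)
    print("selected subset:", selected, "leave-one-out accuracy:", score)
```

Ranking costs one score per metric, while the wrapper search retrains a model for every candidate subset it tries, so running the cheap stage first keeps the expensive stage small. Any other ranker, search strategy, or classifier could be swapped in without changing that structure.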
