World Congress on Engineering and Computer Science

Feature Selection Using Decision Tree Induction in Class level Metrics Dataset for Software Defect Predictions



Abstract

The importance of software testing for quality assurance cannot be overemphasized. The estimation of quality factors is important for minimizing the cost and improving the effectiveness of the software testing process. One such quality factor is fault proneness, for which, unfortunately, no generalized technique is available for effective identification. Many researchers have concentrated on how to select software metrics that are likely to indicate fault proneness. At the same time, dimensionality reduction (feature selection over software metrics) plays a vital role in the effectiveness and quality of the resulting model. Feature selection is important for a variety of reasons, such as generalization, performance, computational efficiency, and feature interpretability. In this paper, a new feature selection method based on decision tree induction is proposed. Relevant features are selected from the class level metrics dataset using the decision tree classifiers employed in the classification process. The attributes that form the rules of these classifiers are taken as the relevant feature set, termed the Decision Tree Induction Rule Based (DTIRB) feature set. Different classifiers are then learned on the new dataset obtained through the decision tree induction process and achieve better performance. The performance of 18 classifiers is studied with the proposed method, and a comparison is made with the Support Vector Machine (SVM) and RELIEF feature selection techniques. The proposed method is observed to outperform the other two for most of the classifiers considered. An overall improvement in the classification process is also found with both the original and the reduced feature sets. The proposed method has the advantage of easy interpretability and comprehensibility. A class level metrics dataset is used to evaluate the performance of the model, with Receiver Operating Characteristic (ROC) curves, Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE) used as the performance measures for checking the effectiveness of the model.
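
The pipeline described in the abstract can be illustrated with a short, hypothetical sketch (not the authors' implementation): a decision tree is induced on the full metric set, the attributes that appear in its split rules are kept as the DTIRB feature subset, a downstream classifier is learned on that subset, and ROC, MAE, and RMSE are reported. scikit-learn, a synthetic stand-in for the class level metrics dataset, and the choice of Naive Bayes as the downstream classifier are assumptions here.

```python
# Hypothetical sketch of DTIRB-style feature selection; synthetic data stands
# in for the class level metrics dataset used in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import roc_auc_score, mean_absolute_error, mean_squared_error

# Stand-in for class level metrics with a fault-proneness label.
X, y = make_classification(n_samples=500, n_features=20, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: induce a decision tree on the full feature set.
dt = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# Step 2: keep only the attributes that occur in the tree's decision rules
# (internal split nodes); leaf nodes carry a negative feature index.
selected = np.unique(dt.tree_.feature[dt.tree_.feature >= 0])
print("DTIRB feature subset:", selected)

# Step 3: learn a downstream classifier on the reduced feature set.
clf = GaussianNB().fit(X_tr[:, selected], y_tr)
proba = clf.predict_proba(X_te[:, selected])[:, 1]

# Step 4: evaluate with the measures named in the abstract: ROC (area under
# the curve), MAE, and RMSE, computed here on the predicted probabilities.
print("ROC AUC:", roc_auc_score(y_te, proba))
print("MAE    :", mean_absolute_error(y_te, proba))
print("RMSE   :", np.sqrt(mean_squared_error(y_te, proba)))
```

The design choice being mimicked is that attributes never used in any rule of the induced tree are treated as irrelevant to fault proneness and are dropped before the other classifiers are learned.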
