Application of mutual information-based sequential feature selection to ISBSG mixed data

Fernandez-Diego Marta; Gonzalez-Ladron-de-Guevara Fernando


Abstract

The Software Development Effort Estimation (SDEE) literature still contains little research on feature selection (FS) techniques that handle both categorical and continuous features. This paper addresses the problem of selecting the most relevant features from the ISBSG (International Software Benchmarking Standards Group) dataset for use in SDEE. The aim is to show the usefulness of splitting the ranked list of features produced by a mutual information-based sequential FS approach into two lists, one for categorical and one for continuous features. These lists are later recombined according to the accuracy of a case-based reasoning model. Four FS algorithms are compared using a complete dataset with 621 projects and 12 features from ISBSG. On the one hand, two algorithms consider only relevance, while the remaining two follow the criterion of maximizing relevance while also minimizing redundancy between any independent feature and the already selected features. On the other hand, the algorithms that do not discriminate between continuous and categorical features consider just one list, whereas those that differentiate them use two lists that are later combined. As a result, the algorithms that use two lists perform better than those that use one list. Thus, it is meaningful to consider two different lists of features so that the categorical features may be selected more frequently. We also suggest promoting the use of the Application Group, Project Elapsed Time, and First Data Base System features in preference to the more frequently used Development Type, Language Type, and Development Platform.
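The two ranking criteria contrasted in the abstract (relevance only, versus relevance minus redundancy with the already selected features) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes all features have already been discretized, uses a simple plug-in estimate of mutual information, and takes the mean mutual information with the selected features as the redundancy term, which is one common formulation that the abstract does not confirm. The feature names, the toy data, and the helper `split_by_type` are hypothetical. The accuracy-driven recombination of the two lists with a case-based reasoning model is not shown, because the abstract does not describe it in enough detail.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def rank_features(features, target, penalize_redundancy):
    """Greedy sequential ranking of features (dict: name -> discrete values).

    Each step picks the remaining feature with the highest score. The score is
    the relevance I(feature; target); when penalize_redundancy is True, the
    mean I(feature; s) over already selected features s is subtracted
    (assumed redundancy term, not taken from the paper).
    """
    relevance = {name: mutual_information(values, target)
                 for name, values in features.items()}
    selected, remaining = [], sorted(features)
    while remaining:
        def score(name):
            if not penalize_redundancy or not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def split_by_type(ranking, categorical):
    """Split one ranked list into (categorical, continuous), keeping rank order."""
    cat = [f for f in ranking if f in categorical]
    cont = [f for f in ranking if f not in categorical]
    return cat, cont


# Hypothetical toy data: "a_dup" duplicates "a"; "c" is weaker but independent of "a".
features = {
    "a":     [0, 0, 0, 0, 1, 1, 1, 1],
    "a_dup": [0, 0, 0, 0, 1, 1, 1, 1],
    "c":     [0, 1, 0, 1, 0, 1, 0, 1],
}
target = [0, 0, 0, 0, 1, 2, 1, 2]

relevance_only = rank_features(features, target, penalize_redundancy=False)
mrmr = rank_features(features, target, penalize_redundancy=True)
categorical_list, continuous_list = split_by_type(mrmr, categorical={"c"})
```

In this toy setup the relevance-only ranking places the duplicated feature ahead of `c`, while the redundancy-penalized ranking promotes `c` because it adds information that `a` does not carry. Splitting the ranking afterwards keeps the relative order within each type, which is the starting point for the two-list variants.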


Bibliographic details

Source
Software Quality Journal, 2018, Issue 4, pp. 1299-1325 (27 pages)
Authors
Fernandez-Diego Marta; Gonzalez-Ladron-de-Guevara Fernando
Author affiliations

Univ Politecn Valencia, Dept Business Org, E-46022 Valencia, Spain;

Univ Politecn Valencia, Dept Business Org, E-46022 Valencia, Spain;

Original format: PDF
Language: English
Keywords
Feature selection; Mutual information; ISBSG; Software development effort estimation; k-nearest neighbor;

