A Combined Approach for Feature Subset Selection and Size Reduction for High Dimensional Data

Anurag Dwivedi; Poonam Sharma


Abstract

Selecting relevant features from a given feature set is one of the important issues in data mining and classification. In general, a dataset may contain many features, but not all of them are necessarily important for a particular analysis or decision-making task: features may share common information, or may be completely irrelevant to the processing at hand. This generally happens because features were selected improperly when the dataset was formed, or because too little information was available about the observed system. In both cases the data will contain features that only increase the processing burden and may ultimately lead to improper outcomes when used for analysis. Methods are therefore required to detect and remove such features. In this paper we present an efficient approach that not only removes unimportant features but also reduces the size of the complete dataset. The proposed algorithm uses information theory to measure the information gain of each feature and a minimum spanning tree to group similar features; in addition, fuzzy c-means clustering is used to remove similar entries from the dataset. Finally, the algorithm is tested with an SVM classifier on 35 publicly available real-world high-dimensional datasets, and the results show that it not only reduces the feature set and data length but also improves the performance of the classifier.
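The feature-selection part of the abstract (information gain per feature, then a minimum spanning tree to group similar features) can be illustrated with a minimal Python sketch. This is not the authors' implementation, since the abstract gives no formulas or thresholds. The sketch assumes discrete-valued features, uses symmetric uncertainty (normalized information gain) as the feature-to-feature similarity, and introduces the hypothetical parameters `min_gain` and `max_distance`; all function names are illustrative. The fuzzy c-means row-reduction step and the SVM evaluation are not covered.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def symmetric_uncertainty(x, y):
    """Information gain normalized to [0, 1]; 1 means fully redundant."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:  # both sequences are constant: nothing to share
        return 0.0
    return 2.0 * information_gain(x, y) / (hx + hy)


def select_features(columns, labels, min_gain=0.05, max_distance=0.5):
    """columns maps a feature name to its list of discrete values.

    min_gain and max_distance are assumed thresholds, not values
    taken from the paper. Returns the sorted names of kept features.
    """
    # Step 1: drop features that carry almost no information about the class.
    gain = {name: information_gain(col, labels) for name, col in columns.items()}
    names = [name for name in columns if gain[name] >= min_gain]
    if not names:
        return []

    # Step 2: Prim's algorithm on the complete feature graph, with edge
    # weight 1 - SU, so a small weight means two features are similar.
    def dist(a, b):
        return 1.0 - symmetric_uncertainty(columns[a], columns[b])

    in_tree = {names[0]}
    mst_edges = []
    while len(in_tree) < len(names):
        w, a, b = min(
            (dist(a, b), a, b)
            for a in in_tree
            for b in names
            if b not in in_tree
        )
        in_tree.add(b)
        mst_edges.append((w, a, b))

    # Step 3: cut long tree edges; the remaining connected components
    # are groups of mutually similar features (tracked with union-find).
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            name = parent[name]
        return name

    for w, a, b in mst_edges:
        if w <= max_distance:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)

    # Step 4: keep the single most informative feature of each group.
    return sorted(max(group, key=gain.get) for group in groups.values())


if __name__ == "__main__":
    y = [0, 0, 1, 1, 2, 2, 3, 3]
    features = {
        "high": [0, 0, 0, 0, 1, 1, 1, 1],      # informative
        "high_inv": [1, 1, 1, 1, 0, 0, 0, 0],  # redundant copy of "high"
        "low": [0, 0, 1, 1, 0, 0, 1, 1],       # informative, independent
        "noise": [0, 1, 0, 1, 0, 1, 0, 1],     # unrelated to the class
    }
    print(select_features(features, y))
```

In this toy dataset the irrelevant feature is filtered out by the information-gain threshold, and the redundant copy is merged into the same tree component as the feature it duplicates, so only one of the pair survives. For continuous features a discretization step would be needed first, which is a further assumption beyond what the abstract states.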


Bibliographic details

Source: International Journal of Engineering Research and Applications, 2015, Issue 9
Authors: Anurag Dwivedi; Poonam Sharma
Original format: PDF
Chinese Library Classification: Industrial technology
