Feature selection plays an important role in improving classification accuracy and clustering quality in many applications. It has been widely studied in supervised learning but has received comparatively little attention in unsupervised learning. In this paper, we present a novel definition of feature differentiation for identifying the relatively important features, and we introduce a one-pass clustering-based feature selection approach. The new method, which has nearly linear time complexity, selects the optimal feature subset according to the variation of the feature differentiation. Experimental results on UCI datasets show that, by removing irrelevant or redundant features, our method achieves promising classification and clustering results on most datasets. Compared with traditional feature selection approaches, the proposed algorithm obtains similar or even better performance in terms of dimensionality reduction and classification accuracy.