In this paper we present a comparative study of feature selection methods for text classification. Ten feature selection methods were evaluated, including a new method called the GU metric. The other nine methods are the Chi-Squared (χ²) statistic, the NGL coefficient, the GSS coefficient, Mutual Information, Information Gain, Odds Ratio, Term Frequency, the Fisher Criterion, and the BSS/WSS coefficient. The experimental evaluations show that the GU metric obtained the best · . and · . scores. The experiments were performed on the 20 Newsgroups data sets with the Naive Probabilistic Classifier.