Conference: International Conference on Modelling, Identification and Control

Feature selection for text classification using genetic algorithms

Abstract

In text classification, feature selection is essential for improving classification effectiveness. This paper presents an empirical study of a feature selection method based on genetic algorithms, applied to different text representation methods. The feature selection algorithm pursues two goals: on the one hand, it searches for a feature subset that maximizes classifier performance; on the other hand, it seeks the feature subset of smallest dimensionality that still achieves higher classification accuracy. To evaluate the approach, three of the best-performing classifiers were selected: Naive Bayes (NB), k-Nearest Neighbors (KNN), and Support Vector Machines (SVMs). The objective is to determine, using the F-measure, whether genetic-algorithm-based feature selection improves text classification performance with a smaller feature set. Experiments were carried out on two benchmark document collections, 20Newsgroups and Reuters-21578. The results were very interesting.
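The two goals described in the abstract (high classifier performance, small feature subset) are commonly folded into a single fitness function over binary feature masks. The sketch below illustrates that general idea only; it is not the authors' implementation. The abstract does not specify the operators or parameters, so tournament selection, one-point crossover, bit-flip mutation, elitism, and the names `ga_select`, `fitness`, `size_weight`, and `toy_evaluate` are all assumptions made for illustration. The `evaluate` callable is a hypothetical stand-in for "train NB, KNN, or an SVM on the selected terms and return its F-measure"; here it is replaced by a trivial rule on a hand-made toy corpus.

```python
import random

def f1_score(y_true, y_pred, positive=1):
    """Binary F-measure: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def fitness(mask, evaluate, size_weight=0.1):
    """Assumed trade-off: classifier score minus a penalty on subset size."""
    if not any(mask):
        return 0.0  # an empty feature subset cannot be evaluated
    return evaluate(mask) - size_weight * sum(mask) / len(mask)

def ga_select(n_features, evaluate, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Evolve binary masks (1 = keep the feature) and return the fittest one."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(m, evaluate))

    def tournament():
        # Binary tournament: pick two individuals, keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a, evaluate) >= fitness(b, evaluate) else b

    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best mask survives unchanged
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation with a small per-gene probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=lambda m: fitness(m, evaluate))
    return best

# Toy corpus (illustrative only): binary term vectors with a class label.
# Terms 0 and 1 are informative; terms 2 to 5 behave like noise.
DOCS = [
    ([1, 1, 0, 1, 0, 0], 1),
    ([1, 0, 1, 0, 0, 1], 1),
    ([0, 1, 0, 0, 1, 0], 1),
    ([0, 0, 1, 1, 0, 0], 0),
    ([0, 0, 0, 0, 1, 1], 0),
    ([0, 0, 1, 0, 0, 1], 0),
]

def toy_evaluate(mask):
    """Stand-in classifier: predict positive if any selected term occurs."""
    y_true = [label for _, label in DOCS]
    y_pred = [int(any(x and m for x, m in zip(vec, mask)))
              for vec, _ in DOCS]
    return f1_score(y_true, y_pred)

best = ga_select(n_features=6, evaluate=toy_evaluate)
print(best, round(fitness(best, toy_evaluate), 3))
```

In a real experiment, `evaluate` would typically run cross-validation with the chosen classifier on the masked document-term matrix, and caching fitness values would avoid retraining for masks that have already been scored. The `size_weight` value controls how strongly smaller subsets are preferred and would need tuning per collection.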
