Conference: International Conference on Artificial Intelligence and Soft Computing

Selection of Relevant Features for Text Classification with K-NN

Abstract

In this paper, we describe five feature selection techniques for text classification. Information gain, the independent significance feature test, the chi-squared test, the odds ratio test, and frequency filtering are compared on text benchmarks based on Wikipedia. For each method we present the classification quality obtained on the test datasets using a K-NN based approach. The main advantage of the evaluated approach is that it reduces the dimensionality of the vector space, which improves the effectiveness of the classification task. The information gain method, which obtained the best results, was then used to evaluate the scalability of feature selection and classification. We also provide results indicating that feature selection is useful for obtaining common-sense features that describe naturally formed categories.
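
As a rough illustration of the pipeline the abstract outlines (rank terms by information gain, keep the top few, then classify with K-NN in the reduced space), the following is a minimal sketch and not the authors' implementation. The set-of-tokens document representation, the Jaccard similarity measure, the toy data, and all function names are assumptions made for this example; the abstract does not specify them.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


def knn_predict(train_docs, train_labels, query, features, n_neighbors=3):
    """Majority vote among the nearest training documents.

    Similarity is Jaccard overlap restricted to the selected features
    (an assumption of this sketch, not taken from the paper).
    """
    feats = set(features)
    q = set(query) & feats

    def similarity(doc):
        d = set(doc) & feats
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    nearest = sorted(range(len(train_docs)),
                     key=lambda i: similarity(train_docs[i]),
                     reverse=True)[:n_neighbors]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]


# Toy corpus (hypothetical): each document is a set of tokens.
docs = [
    {"the", "cat", "purrs"},
    {"the", "cat", "sleeps"},
    {"the", "dog", "barks"},
    {"the", "dog", "sleeps"},
]
labels = ["cat", "cat", "dog", "dog"]

features = select_features(docs, labels, k=2)
prediction = knn_predict(docs, labels, {"a", "cat", "sleeps"}, features)
print(features, prediction)
```

On this toy corpus the term "the" appears in every document and so carries no information gain, while "cat" and "dog" separate the two classes perfectly; discarding the uninformative terms is what shrinks the vector space before the K-NN step.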