Journal: Journal of Information Science and Engineering

Category Discrimination Based Feature Selection Algorithm in Chinese Text Classification


Abstract

Improving classification precision is a major issue in Chinese text classification. The tf-idf algorithm is a classic and widely used feature selection algorithm based on the vector space model (VSM). However, the traditional tf-idf algorithm neglects a feature term's distribution within a category and among categories, which leads to many unreasonable selection results. This paper improves the traditional tf-idf algorithm by introducing the concept of Category Discrimination. We evaluate our algorithm experimentally and compare it with other algorithms. The experimental results show that the improved tf-idf algorithm consistently achieves higher precision and recall than the traditional tf-idf algorithm and is superior to the other algorithms overall. It is therefore a more effective feature selection algorithm for text classification.
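To illustrate the limitation the abstract attributes to traditional tf-idf, here is a minimal sketch of the textbook weighting (raw term frequency times log(N / df)). It is not the paper's improved algorithm: the abstract does not define Category Discrimination, so none is implemented here. The function name `tf_idf`, the toy corpus, and the labels are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes the common variant: raw term frequency times log(N / df).
    Other tf-idf variants (smoothing, normalization) exist.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy corpus, already segmented into terms.
docs = [
    ["match", "goal", "team"],     # sports
    ["goal", "report", "coach"],   # sports
    ["stock", "market", "team"],   # finance
    ["stock", "report", "bank"],   # finance
]
# Category labels exist, but tf_idf never receives them.
labels = ["sports", "sports", "finance", "finance"]

weights = tf_idf(docs)
goal_weight = weights[1]["goal"]      # "goal" occurs only in sports documents
report_weight = weights[1]["report"]  # "report" is spread over both categories
print(goal_weight == report_weight)
```

In this toy corpus "goal" and "report" each occur in two documents, so they receive the same weight in the second document even though only "goal" is concentrated in a single category. The weighting never sees the labels, which is the kind of category-blind behavior the abstract describes.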
