2010 International Conference on Multimedia Technology

An Improved Algorithm to Term Weighting in Text Classification



Abstract

The traditional TF-IDF algorithm is a common method for measuring feature weights in text categorization. However, it does not take the inter-class and intra-class distribution of feature terms into consideration, so it cannot effectively weigh the distribution proportion of feature terms. To solve this problem, inter-class and intra-class information entropy, which describes the distribution of feature terms, is used to revise the TF-IDF weight. Simulation experiments demonstrate that the improved TF-IDF algorithm achieves better classification results than the traditional TF-IDF algorithm.
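
The abstract does not state the revised weighting formula, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that a term is more useful when it is concentrated in few classes (low inter-class entropy) and spread evenly over the documents of its own class (high intra-class entropy). The function names `entropy`, `normalized_entropy`, and `entropy_tfidf`, and the way the two factors are combined, are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy (natural log) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def normalized_entropy(counts):
    """Entropy scaled to [0, 1]; defined as 0 when there is a single bucket."""
    if len(counts) <= 1:
        return 0.0
    return entropy(counts) / math.log(len(counts))


def entropy_tfidf(docs, labels):
    """Return one {term: weight} dict per document.

    docs:   list of token lists
    labels: class label of each document (same length as docs)
    """
    n_docs = len(docs)
    classes = sorted(set(labels))
    doc_counts = [Counter(d) for d in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter(t for counts in doc_counts for t in counts)

    # Indices of the documents belonging to each class.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    weights = []
    for i, counts in enumerate(doc_counts):
        length = sum(counts.values())
        doc_weights = {}
        for term, c in counts.items():
            tf = c / length
            idf = math.log(n_docs / df[term])
            # Inter-class factor: 1 when the term occurs in a single class,
            # 0 when it is spread evenly over all classes.
            per_class = [sum(doc_counts[j][term] for j in by_class[k])
                         for k in classes]
            inter = 1.0 - normalized_entropy(per_class)
            # Intra-class factor: 1 when the term is spread evenly over the
            # documents of this document's class, 0 when it sits in one.
            per_doc = [doc_counts[j][term] for j in by_class[labels[i]]]
            intra = normalized_entropy(per_doc)
            # Assumed combination: reduces to plain TF-IDF when both are 0.
            doc_weights[term] = tf * idf * (1.0 + inter) * (1.0 + intra)
        weights.append(doc_weights)
    return weights


if __name__ == "__main__":
    docs = [["ball", "goal", "the"], ["ball", "ball", "the"],
            ["vote", "law", "the"], ["vote", "vote", "the"]]
    labels = ["sport", "sport", "politics", "politics"]
    for doc_weights in entropy_tfidf(docs, labels):
        print(doc_weights)
```

In this sketch the intra-class factor depends on the document's own class, so it applies only to labeled training documents. How to weight unlabeled documents at prediction time is left open here.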
