Proceedings of the 2010 IEEE International Conference on Information and Automation

An algorithm for selecting Chinese features based on TF-NIDF weight

Abstract

This article addresses the selection of Chinese features by TF-IDF weight in text categorization. TF-IDF weighting is widely used in text categorization because of its simplicity. However, it cannot express the relationship between how often a feature appears in one class and how often it appears in the other classes. To solve this problem, we designed the TF-NIDF weighting method, which expresses that relationship and is used to compute feature weights. We also incorporated the weight into a Naïve Bayesian classifier and tested it on Chinese text data. Experiments showed that a Naïve Bayesian classifier with feature selection based on TF-NIDF weight achieves higher categorization precision than one with feature selection based on traditional TF-IDF weight.
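
For orientation, the conventional TF-IDF baseline that the abstract compares against can be sketched as follows. This is only an illustration of the standard textbook weighting, not the TF-NIDF method, whose formula is not given in the abstract. The function names `tf_idf_scores` and `select_features` are hypothetical, documents are assumed to be already segmented into tokens, and summing a term's weight over all documents is one assumed way to rank features. The score never looks at class labels, which is the limitation the abstract describes.

```python
import math
from collections import Counter


def tf_idf_scores(docs):
    """Return {term: TF-IDF summed over all documents} for tokenized docs.

    Assumption: standard unsmoothed TF-IDF, tf = count / doc length,
    idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        counts = Counter(doc)
        for term, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            scores[term] += tf * idf
    return dict(scores)


def select_features(docs, k):
    """Keep the k terms with the highest summed TF-IDF (ties broken by term)."""
    scores = tf_idf_scores(docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy corpus of pre-segmented documents (hypothetical data).
docs = [
    ["stock", "market", "rise"],
    ["stock", "match", "goal"],
    ["stock", "market", "match"],
]
top_terms = select_features(docs, 2)
```

A term such as "stock" that occurs in every document receives an IDF of zero and is never selected, regardless of how it is distributed across classes.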