Conference: International Conference on Software Engineering and Data Mining

Utilizing Category Relevancy Factor for Text Categorization

Abstract

One of the main preprocessing steps in building a high-performance text classifier is feature weighting. Commonly used feature weighting methods, such as TF and IDF-based methods, consider only the distribution of a feature across the documents and do not use class information. In this paper, we present the TFCRF (Term Frequency and Category Relevancy Factor) method, in which the weight of a feature depends on its power to discriminate the classes from each other, computed using class information. The results show a significant improvement in the performance of the SVM algorithm when the TFCRF feature weighting method is used, compared with the other standard feature weighting methods that were implemented.
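The abstract does not state the TFCRF formula, so the sketch below does not attempt to reproduce it. It only illustrates the contrast the abstract draws: a class-agnostic weight such as IDF versus a weight that uses class labels. The function `relevancy_ratio`, its smoothing parameter, and the toy corpus are all assumptions made for illustration, not the authors' definition.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Class-agnostic IDF: log(N / document frequency) for each term."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def relevancy_ratio(docs, labels, term, category, smooth=1.0):
    """Hypothetical class-aware factor (NOT the paper's formula).

    Ratio of the smoothed fraction of in-category documents containing
    `term` to the smoothed fraction of out-of-category documents
    containing it. Values above 1 mean the term points toward `category`.
    """
    in_cat = [doc for doc, label in zip(docs, labels) if label == category]
    out_cat = [doc for doc, label in zip(docs, labels) if label != category]
    p_in = (sum(term in doc for doc in in_cat) + smooth) / (len(in_cat) + 2 * smooth)
    p_out = (sum(term in doc for doc in out_cat) + smooth) / (len(out_cat) + 2 * smooth)
    return p_in / p_out


# Toy corpus (invented for illustration): two "sport" and two "finance" documents.
docs = [
    ["goal", "match", "team"],
    ["team", "goal", "score"],
    ["market", "stock", "team"],
    ["stock", "price", "market"],
]
labels = ["sport", "sport", "finance", "finance"]

idf = idf_weights(docs)
print("IDF goal :", idf["goal"])
print("IDF stock:", idf["stock"])
print("ratio goal  | sport:", relevancy_ratio(docs, labels, "goal", "sport"))
print("ratio stock | sport:", relevancy_ratio(docs, labels, "stock", "sport"))
```

In this toy corpus "goal" and "stock" each occur in two of four documents, so IDF gives them the same weight. The label-aware ratio separates them: it is above 1 for "goal" with respect to "sport" and below 1 for "stock".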
