Conference: The 2nd International Conference on Software Engineering and Data Mining

Utilizing Category Relevancy Factor for text categorization

Abstract

Feature weighting is one of the main preprocessing steps in building a high-performance text classifier. Commonly used feature weighting methods, such as TF- and IDF-based methods, consider only the distribution of a feature across the document(s) and ignore class information. In this paper, we present the TFCRF (Term Frequency and Category Relevancy Factor) method, in which the weight of a feature depends on its power to discriminate between classes, as determined from class information. The results show a significant improvement in the performance of the SVM algorithm when the TFCRF feature weighting method is used, compared with the other standard feature weighting methods implemented.
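
The abstract does not state the TFCRF formula, so the following is only a rough sketch of the general idea it describes: scaling a term's frequency by a score that reflects how strongly the term is tied to one class rather than the others. The scoring rule (a smoothed log ratio of in-class to out-of-class document frequency), the function names `relevancy_factors` and `weight_document`, the `smoothing` parameter, and the toy corpus are all assumptions made for illustration and should not be read as the paper's definition.

```python
import math
from collections import Counter


def relevancy_factors(docs, labels, smoothing=1.0):
    """Map (term, label) -> an illustrative class-relevancy score.

    ASSUMPTION: the score is log(1 + inside / outside), where `inside` is the
    smoothed fraction of documents in the class that contain the term and
    `outside` is the same fraction over all other classes. This is a stand-in,
    not the formula from the paper.
    """
    class_sizes = Counter(labels)
    total_docs = len(docs)

    # (term, label) -> number of documents with that label containing the term
    doc_freq = Counter()
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            doc_freq[(term, label)] += 1

    vocab = {term for term, _ in doc_freq}
    factors = {}
    for term in vocab:
        term_total = sum(doc_freq[(term, c)] for c in class_sizes)
        for c, n_c in class_sizes.items():
            in_class = doc_freq[(term, c)]
            inside = (in_class + smoothing) / (n_c + smoothing)
            outside = (term_total - in_class + smoothing) / (total_docs - n_c + smoothing)
            factors[(term, c)] = math.log(1.0 + inside / outside)
    return factors


def weight_document(tokens, label, factors):
    """Weight each term as raw term frequency times its class-relevancy score."""
    tf = Counter(tokens)
    return {term: count * factors.get((term, label), 0.0) for term, count in tf.items()}


# Hypothetical toy corpus: two classes, with "team" appearing in both.
docs = [
    ["goal", "match", "team"],
    ["team", "coach", "goal"],
    ["stock", "market", "team"],
    ["market", "price", "stock"],
]
labels = ["sport", "sport", "finance", "finance"]

factors = relevancy_factors(docs, labels)
weights = weight_document(["goal", "goal", "team"], "sport", factors)
print(weights)
```

In this toy corpus, "goal" occurs only in the sport documents while "team" also occurs in a finance document, so "goal" receives the larger score for the sport class. A purely TF-based weight would not make that distinction, which is the gap the abstract says class information is meant to fill.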