Venue: Australasian Joint Conference on Artificial Intelligence

A New Supervised Term Ranking Method for Text Categorization

Abstract

In text categorization, supervised term weighting methods such as Information Gain, the χ² statistic, and Odds Ratio have been applied to improve classification performance by weighting terms with respect to each category. The literature offers three term ranking methods for summarizing a term's per-category weights in multi-class text categorization: Summation, Average, and Maximum. In this paper we present a new term ranking method for summarizing term weights, Maximum Gap. Using two weighting methods, information gain and the χ² statistic, we set up controlled experiments to compare the term ranking methods. The Reuters-21578 text corpus is used as the dataset. Two popular classification algorithms, SVM and BoosTexter, are adopted to evaluate the performance of the different term ranking methods. Experimental results show that the new term ranking method performs better.
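
To make the three existing ranking methods concrete, the following is a minimal Python sketch, not the authors' implementation. It computes the standard χ² weight of a term for one category from a 2x2 document-count table, then collapses a term's per-category weights into a single score by Summation, Average, or Maximum. The function names, the toy weights, and the optional category priors in the Average method are assumptions made for illustration. Maximum Gap is not defined in the abstract, so it is deliberately left out.

```python
from typing import Callable, Dict, List, Optional, Sequence


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Standard chi-square weight of a term for one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that do not contain the term
    d: documents outside the category that do not contain the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): treat as no association.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def rank_summation(weights: Sequence[float]) -> float:
    """Summation: add up the per-category weights."""
    return sum(weights)


def rank_average(weights: Sequence[float],
                 priors: Optional[Sequence[float]] = None) -> float:
    """Average: plain mean, or a prior-weighted mean if priors are given.

    Which of the two the paper uses is not stated in the abstract;
    both variants are shown here as an assumption.
    """
    if priors is None:
        return sum(weights) / len(weights)
    return sum(p * w for p, w in zip(priors, weights))


def rank_maximum(weights: Sequence[float]) -> float:
    """Maximum: keep only the largest per-category weight."""
    return max(weights)


def top_terms(term_weights: Dict[str, List[float]],
              scorer: Callable[[Sequence[float]], float],
              k: int) -> List[str]:
    """Return the k terms with the highest summarized score."""
    ranked = sorted(term_weights,
                    key=lambda t: scorer(term_weights[t]),
                    reverse=True)
    return ranked[:k]


# Hypothetical per-category weights for three terms over three categories.
toy_weights = {
    "wheat": [9.0, 0.5, 0.5],   # strongly tied to one category
    "said": [3.5, 3.5, 3.5],    # moderately weighted everywhere
    "oil": [0.5, 6.0, 1.0],
}

by_sum = top_terms(toy_weights, rank_summation, 3)
by_max = top_terms(toy_weights, rank_maximum, 3)
```

On the toy weights, Summation places the evenly spread term "said" first, while Maximum places the category-specific term "wheat" first. This shows how the choice of summarizing method alone can change which terms are selected.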