Journal: 《贵州师范学院学报》

A Hybrid Feature Dimensionality Reduction Strategy for Text Classification

Abstract

Feature dimensionality reduction has long been an important research topic in text classification, and one effective way to achieve it is to design efficient feature selection methods. Existing feature selection methods commonly delete, by mistake, features with a strong ability to discriminate between categories while retaining features whose discriminating ability is weak. To address this, the paper proposes a feature dimensionality reduction strategy. The method first defines and quantifies the features, then builds a retained set of single-source features, removes the features common to all classes, and finally adjusts the weights of the multi-source features, thereby reducing the number of features and improving classification performance. Comparative experiments on the Reuters-21578 and NewsGroups corpora indicate that the new strategy is effective and feasible.
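The abstract does not spell out how features are defined or how the weights are adjusted, so the following is only a minimal sketch of one possible reading. It assumes that a single-source feature is a term occurring in exactly one class, a common feature is a term occurring in every class, and a multi-source feature is anything in between. The function names `partition_features` and `multi_source_weight`, and the linear down-weighting formula, are hypothetical illustrations and are not taken from the paper.

```python
from collections import defaultdict


def partition_features(docs):
    """Split terms into single-source, multi-source and common sets.

    docs is an iterable of (class_label, tokens) pairs.
    Assumed definitions (not from the paper):
      single-source: the term occurs in exactly one class (kept as-is)
      common:        the term occurs in every class (removed)
      multi-source:  the term occurs in several, but not all, classes
    """
    classes_of = defaultdict(set)
    all_classes = set()
    for label, tokens in docs:
        all_classes.add(label)
        for tok in set(tokens):
            classes_of[tok].add(label)

    single, multi, common = {}, {}, set()
    for tok, cls in classes_of.items():
        if len(cls) == 1:
            single[tok] = next(iter(cls))
        elif cls == all_classes:
            common.add(tok)
        else:
            multi[tok] = frozenset(cls)
    return single, multi, common


def multi_source_weight(n_feature_classes, n_classes):
    """Hypothetical weight adjustment for a multi-source feature.

    The weight shrinks linearly as the feature spreads over more classes:
    close to 1 when it is nearly single-source, 0 when it is common.
    """
    return 1.0 - (n_feature_classes - 1) / (n_classes - 1)


# Toy corpus with three classes, for illustration only.
docs = [
    ("grain", ["wheat", "price", "the"]),
    ("trade", ["tariff", "price", "the"]),
    ("oil", ["crude", "barrel", "the"]),
]
single, multi, common = partition_features(docs)
weights = {tok: multi_source_weight(len(cls), 3) for tok, cls in multi.items()}
```

In this toy corpus, "the" occurs in all three classes and would be dropped, "wheat", "tariff", "crude" and "barrel" are retained as single-source features, and "price" is a multi-source feature whose weight is reduced. A real implementation would follow the quantification and weighting defined in the full paper.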
