Journal: Expert Systems with Applications

Text classification using genetic algorithm oriented latent semantic features



Abstract

In this paper, genetic algorithm oriented latent semantic features (GALSF) are proposed to obtain a better representation of documents in text classification. The proposed approach consists of a feature selection stage and a feature transformation stage. The first stage is carried out using state-of-the-art filter-based methods. The second stage employs latent semantic indexing (LSI) empowered by a genetic algorithm, so that a better projection is attained using appropriate singular vectors which, unlike in the standard LSI approach, are not limited to those corresponding to the largest singular values. In this way, singular vectors with small singular values may also be used for projection, whereas vectors with large singular values may be eliminated to obtain better discrimination. Experimental results demonstrate that GALSF outperforms both LSI and filter-based feature selection methods on benchmark datasets for various feature dimensions.
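The second stage described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so everything below is an assumption made for illustration. Here a chromosome is a boolean mask over singular vectors constrained to select exactly `k` of them, the fitness is the training accuracy of a nearest-centroid classifier, and the GA uses binary tournaments, uniform crossover, bit-flip mutation, and elitism. All function names (`svd_basis`, `project`, `centroid_accuracy`, `repair`, `ga_select`) are hypothetical, and the filter-based feature selection stage is omitted.

```python
import numpy as np

def svd_basis(X):
    """Singular values and right singular vectors (rows of Vt) of a doc-by-term matrix."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)  # s is sorted in descending order
    return s, Vt

def project(X, Vt, mask):
    """Project documents onto the singular vectors flagged in a boolean mask."""
    return X @ Vt[mask].T

def centroid_accuracy(Z, y):
    """Assumed stand-in fitness: training accuracy of a nearest-centroid classifier."""
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])
    dists = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y).mean())

def repair(mask, k, rng):
    """Force a mask to select exactly k vectors by random flips."""
    mask = mask.copy()
    on, off = np.flatnonzero(mask), np.flatnonzero(~mask)
    if len(on) > k:
        mask[rng.choice(on, len(on) - k, replace=False)] = False
    elif len(on) < k:
        mask[rng.choice(off, k - len(on), replace=False)] = True
    return mask

def ga_select(X, y, Vt, k, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for k singular vectors that maximize the stand-in fitness."""
    rng = np.random.default_rng(seed)
    r = Vt.shape[0]

    def fitness(m):
        return centroid_accuracy(project(X, Vt, m), y)

    # Seed the population with the standard LSI choice (k largest singular values).
    top_k = np.arange(r) < k
    pop = [top_k] + [repair(rng.random(r) < 0.5, k, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        scores = np.array([fitness(m) for m in pop])
        children = [pop[int(scores.argmax())]]  # elitism: keep the best mask
        while len(children) < pop_size:
            # Binary tournament for each parent.
            i, j = rng.integers(pop_size, size=2)
            a = pop[i] if scores[i] >= scores[j] else pop[j]
            i, j = rng.integers(pop_size, size=2)
            b = pop[i] if scores[i] >= scores[j] else pop[j]
            child = np.where(rng.random(r) < 0.5, a, b)  # uniform crossover
            child = child ^ (rng.random(r) < p_mut)      # bit-flip mutation
            children.append(repair(child, k, rng))
        pop = children
    scores = np.array([fitness(m) for m in pop])
    return pop[int(scores.argmax())]

if __name__ == "__main__":
    # Synthetic data: the class signal sits in a low-variance direction.
    rng = np.random.default_rng(1)
    y = np.repeat([0, 1], 20)
    X = rng.normal(size=(40, 12))
    X[:, 0] *= 10.0       # dominant but uninformative direction
    X[:, 5] += 0.8 * y    # weak but discriminative direction
    s, Vt = svd_basis(X)
    mask = ga_select(X, y, Vt, k=3)
    print("selected singular-vector indices:", np.flatnonzero(mask))
```

Because the population is seeded with the top-`k` mask and the best individual is always carried over, the returned mask never scores below standard LSI under this fitness. Scoring on the training data is a simplification; a held-out split or cross-validation would be a more faithful measure of discrimination.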
