
A Hybrid Text Classification Model Based on Rough Sets and Genetic Algorithms


Abstract

Automatic categorization of documents into predefined taxonomies is a crucial step in data mining and knowledge discovery. Standard machine learning techniques such as support vector machines (SVM) and related large-margin methods have been applied successfully to this task. Unfortunately, the high dimensionality of the input feature vectors slows classification. The kernel parameter settings chosen for the SVM during training affect classification accuracy, and feature selection is another factor that affects accuracy. The objective of this work is to reduce the dimension of the feature vectors and to optimize the parameters in order to improve SVM classification accuracy and speed. To improve classification speed, we apply rough set theory to reduce the feature vector space. To improve classification accuracy, we present a genetic algorithm approach for feature selection and parameter optimization. Experimental results indicate that our method is more effective than traditional SVM methods and other traditional methods.
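The abstract does not say which rough set reduction procedure the authors use. As an illustration only, the sketch below applies the well-known greedy QuickReduct heuristic to a toy table of binary term-presence features. The function names (`partition`, `dependency`, `quick_reduct`) and the toy data are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Row = Tuple[Hashable, ...]


def partition(rows: Sequence[Row], attrs: Sequence[int]) -> List[Set[int]]:
    """Group row indices into equivalence classes that agree on `attrs`."""
    blocks: Dict[Tuple[Hashable, ...], Set[int]] = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def dependency(rows: Sequence[Row], labels: Sequence[Hashable],
               attrs: Sequence[int]) -> float:
    """Fraction of rows in the positive region of `attrs`.

    A row is in the positive region when every row that is
    indiscernible from it on `attrs` carries the same label.
    """
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def quick_reduct(rows: Sequence[Row], labels: Sequence[Hashable]) -> List[int]:
    """Greedily add the attribute that raises the dependency degree most,
    until the subset discerns the labels as well as the full attribute set."""
    n_attrs = len(rows[0])
    full = dependency(rows, labels, list(range(n_attrs)))
    reduct: List[int] = []
    best = dependency(rows, labels, reduct)
    while best < full:
        candidates = [a for a in range(n_attrs) if a not in reduct]
        # Ties are broken in favour of the lowest attribute index.
        best, attr = max(
            ((dependency(rows, labels, reduct + [a]), a) for a in candidates),
            key=lambda pair: (pair[0], -pair[1]),
        )
        reduct.append(attr)
    return sorted(reduct)


# Hypothetical toy corpus: columns are presence of "goal", "vote", "the".
rows = [
    (1, 0, 1),
    (1, 0, 0),
    (0, 1, 1),
    (0, 1, 0),
    (1, 1, 1),
    (0, 0, 0),
]
labels = ["sport", "sport", "politics", "politics", "politics", "politics"]

print(quick_reduct(rows, labels))  # → [0, 1]
```

In this toy table the stop word column ("the") carries no information about the label, so the greedy search drops it and keeps the two discriminative terms. A reduced feature set of this kind could then be handed to a genetic algorithm that searches over feature masks and SVM kernel parameters, which is the second stage the abstract describes.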
