Journal: 计算机仿真 (Computer Simulation)

Modeling and Simulation Research on Automatic Web Page Classification

Abstract

Automatic web page classification helps users quickly find the pages they need. The number of pages on the web is very large, and web data is semi-structured, massive, and high-dimensional. Traditional text classification methods cannot reduce dimensionality or eliminate redundant information, so they are prone to the curse of dimensionality and yield low classification accuracy, which makes it hard for users to find the pages they want. To improve accuracy, this paper proposes an automatic web page classification method that combines principal component analysis (PCA) with a support vector machine (SVM). First, the web data is preprocessed: feature vectors are extracted and redundant information is removed. Next, PCA reduces the dimensionality of the feature vectors. Finally, the SVM classifies the pages automatically. In simulation experiments on a web page dataset, classification accuracy exceeds 95% and classification speed also improves, which indicates that the PCA-SVM method is effective for web page classification.
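
The pipeline described in the abstract (feature extraction, PCA dimensionality reduction, SVM classification) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the features, the SVM kernel, or the dataset, so the toy pages, the bag-of-words features, the power-iteration PCA, the linear SVM trained by subgradient descent, and every function name and hyperparameter below are hypothetical choices made for this example.

```python
import math
import random
from collections import Counter

def bag_of_words(docs):
    """Term-count vectors over a shared, sorted vocabulary (assumed feature choice)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([float(counts[t]) for t in vocab])
    return rows

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def pca_fit(rows, k, iters=200, seed=0):
    """Top-k principal directions via power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)  # seeded start vector keeps the run reproducible
    components = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = matvec(cov, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigval = sum(a * b for a, b in zip(v, matvec(cov, v)))
        components.append(v)
        # Deflate: remove the direction just found before searching for the next one.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return mean, components

def pca_transform(rows, mean, components):
    """Project centered rows onto the principal directions."""
    return [[sum((x - m) * c for x, m, c in zip(r, mean, comp))
             for comp in components] for r in rows]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear soft-margin SVM: subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularization shrink
            if margin < 1.0:  # point is inside the margin: take a hinge step
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical, already-tokenized "pages": label +1 = sports, -1 = technology.
pages = [
    ["football", "match", "goal", "team"],
    ["team", "goal", "score", "match"],
    ["football", "score", "team"],
    ["python", "code", "software"],
    ["software", "code", "computer", "python"],
    ["computer", "code", "program"],
]
labels = [1, 1, 1, -1, -1, -1]

features = bag_of_words(pages)                     # 10-dimensional count vectors
mean, components = pca_fit(features, k=2)          # reduce 10 dimensions to 2
reduced = pca_transform(features, mean, components)
w, b = train_linear_svm(reduced, labels)
predictions = [predict(w, b, z) for z in reduced]
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print("reduced dimension:", len(reduced[0]), "training accuracy:", accuracy)
```

The sketch evaluates on its own training pages only to show the data flow; it says nothing about the accuracy reported in the abstract. In practice, library implementations such as scikit-learn's `PCA` and `SVC` would likely replace the hand-written routines.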
