Journal: 《计算机应用与软件》 (Computer Applications and Software)

The UCM Algorithm and Its Application in an E-Government Webpage Classification System

Abstract

This paper presents UCM (UC and SVM), a webpage classification algorithm for large training sets. UCM combines the strengths of the support vector machine (SVM) and unsupervised clustering (UC) to achieve high classification accuracy at a higher speed. In the training stage, UCM uses UC to obtain cluster centres. In the classification stage, UCM computes the distance from the webpage to be classified to the positive centres and to the negative centres. If the difference between the two distances is large enough, the webpage is classified by UC; otherwise it is classified by the pruned SVM. Applied to an e-government webpage classification system, UCM achieves accuracy much higher than UC and slightly higher than SVM. In classification speed, UCM falls between the two: slower than UC but far faster than SVM.
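The routing rule of the classification stage can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance metric, how the distances to several centres are aggregated, or the threshold, so Euclidean distance, the nearest-centre rule, and the `margin` parameter are all assumptions. The names `ucm_classify` and `svm_fallback` are hypothetical, and `svm_fallback` stands in for the pruned SVM as any callable that returns +1 or -1.

```python
import math
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    # Assumption: the abstract does not name the metric; Euclidean is used here.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_distance(x: Vector, centres: Sequence[Vector]) -> float:
    # Assumption: distance to a set of centres is the distance to the closest one.
    return min(euclidean(x, c) for c in centres)


def ucm_classify(
    x: Vector,
    pos_centres: Sequence[Vector],
    neg_centres: Sequence[Vector],
    margin: float,
    svm_fallback: Callable[[Vector], int],
) -> Tuple[int, str]:
    """Return (label, route), where route records which classifier decided."""
    d_pos = nearest_distance(x, pos_centres)
    d_neg = nearest_distance(x, neg_centres)
    # Clear-cut case: the two distances differ enough, so the centres decide.
    if abs(d_pos - d_neg) >= margin:
        return (1 if d_pos < d_neg else -1), "UC"
    # Ambiguous case: defer to the slower but more accurate classifier.
    return svm_fallback(x), "SVM"


# Toy usage with hand-picked centres and a stub in place of a trained SVM.
pos = [(1.0, 1.0), (2.0, 1.5)]
neg = [(-1.0, -1.0)]
stub_svm = lambda v: 1 if v[0] + v[1] > 0 else -1

print(ucm_classify((1.2, 0.9), pos, neg, margin=0.5, svm_fallback=stub_svm))
print(ucm_classify((-0.05, 0.0), pos, neg, margin=0.5, svm_fallback=stub_svm))
```

In this sketch the first point lies close to a positive centre and is decided by the centres alone, while the second lies almost equally far from both sides and is passed to the fallback. This mirrors the speed trade-off described in the abstract: only ambiguous pages pay the cost of the SVM.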
