Journal: 《计算机工程与设计》 (Computer Engineering and Design)

A KNN Text Classification Algorithm Based on Vector Projection

Abstract

To address the long classification time of the K-nearest neighbor (KNN) algorithm, this paper analyzes ways to improve its classification efficiency. Building on KNN and combining vector projection theory with the iDistance index structure, it proposes an improved algorithm named PKNN. PKNN compares the one-dimensional projection distances between the sample to be classified and the training samples to find the training samples most likely to be its nearest neighbors. This reduces the number of training samples that take part in the distance computation, and therefore the computation required for each classification. Experimental results show that PKNN clearly improves the efficiency of KNN text classification, and its design makes it well suited to large-scale, high-dimensional text classification.
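The pruning idea in the abstract can be illustrated with a small sketch. The abstract does not say how the projection direction is chosen or how iDistance is used, so everything below is an assumption made for illustration: the class name `ProjectionKNN`, the caller-supplied `direction`, and a sorted list of one-dimensional projections standing in for the iDistance index. The sketch relies only on the fact that, for a unit vector u, |u·x - u·q| never exceeds the Euclidean distance between x and q, so a training point whose projection gap is already larger than the current k-th best distance cannot be a nearest neighbor.

```python
import bisect
import heapq
import math
from collections import Counter


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    """Full-dimensional Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ProjectionKNN:
    """Hypothetical sketch: KNN with pruning by 1-D projection distance."""

    def __init__(self, points, labels, direction):
        # Normalise the (assumed, caller-chosen) projection direction.
        norm = math.sqrt(dot(direction, direction))
        self.u = [d / norm for d in direction]
        # Keep training points sorted by their scalar projection; this sorted
        # list is a simple stand-in for the iDistance index.
        order = sorted(range(len(points)), key=lambda i: dot(points[i], self.u))
        self.points = [points[i] for i in order]
        self.labels = [labels[i] for i in order]
        self.proj = [dot(p, self.u) for p in self.points]

    def classify(self, query, k):
        """Return (majority label, number of full distances computed)."""
        qp = dot(query, self.u)
        n = len(self.points)
        right = bisect.bisect_left(self.proj, qp)
        left = right - 1
        heap = []  # max-heap of the k best so far, stored as (-distance, index)
        examined = 0
        while left >= 0 or right < n:
            # Visit whichever side has the smaller projection gap next.
            gap_l = qp - self.proj[left] if left >= 0 else math.inf
            gap_r = self.proj[right] - qp if right < n else math.inf
            if gap_l <= gap_r:
                i, gap = left, gap_l
                left -= 1
            else:
                i, gap = right, gap_r
                right += 1
            # The gap is a lower bound on the true distance, and all remaining
            # points have gaps at least this large, so the search can stop.
            if len(heap) == k and gap > -heap[0][0]:
                break
            d = euclidean(query, self.points[i])
            examined += 1
            if len(heap) < k:
                heapq.heappush(heap, (-d, i))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, i))
        votes = Counter(self.labels[i] for _, i in heap)
        return votes.most_common(1)[0][0], examined


# Toy data (made up for illustration): two well-separated clusters.
train_points = [(0, 0), (0.1, 0.2), (0.2, 0.1), (5, 5), (5.1, 4.9), (4.9, 5.2)]
train_labels = ["a", "a", "a", "b", "b", "b"]
model = ProjectionKNN(train_points, train_labels, direction=(1, 1))
label, examined = model.classify((0.1, 0.1), k=3)
```

In this toy setup `examined` counts how many full-dimensional distances were computed, which is the cost the abstract says PKNN reduces. How much is saved in practice depends on how well the chosen direction separates the data, which this sketch does not address.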
