Journal: 《计算机技术与发展》 (Computer Technology and Development)

A KNN Text Classification Method Based on Association Analysis

Abstract

The KNN algorithm is widely applied in text classification, a branch of data mining. After analyzing the deficiencies of the traditional KNN method, this paper proposes an improved KNN algorithm based on association analysis. The method first extracts, for each class of training documents, the frequent feature sets of that class and the documents associated with them. When a document of unknown class is to be classified, the results of this per-class association analysis are used to determine an appropriate number of nearest neighbors k and to select the k nearest neighbors quickly from the labeled training documents; the class of the unknown document is then decided from the classes of those neighbors. Compared with text classification based on traditional KNN, the improved method determines the value of k more reliably and reduces time complexity. Experimental results show that the proposed method improves both the efficiency and the accuracy of text classification.
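
The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the mining procedure, the support threshold, the similarity measure, or the rule that derives k, so everything here is an assumption made for illustration: frequent feature sets are found by brute-force enumeration up to size 2, similarity is cosine over term counts, and k is simply the number of candidate documents capped at a hypothetical `max_k`. The function names (`frequent_feature_sets`, `train`, `classify`) are likewise hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations
import math


def frequent_feature_sets(docs, min_support=0.5, max_size=2):
    """Map each frequent feature set to the indices of the documents containing it.

    Brute-force enumeration (an assumption; the paper's mining method is not given).
    """
    term_sets = [set(d) for d in docs]
    vocab = sorted(set().union(*term_sets))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            items = frozenset(combo)
            covered = [i for i, ts in enumerate(term_sets) if items <= ts]
            if len(covered) / len(docs) >= min_support:
                result[items] = covered
    return result


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def train(labeled_docs, min_support=0.5):
    """Group documents by class and mine frequent feature sets per class."""
    by_class = {}
    for tokens, label in labeled_docs:
        by_class.setdefault(label, []).append(tokens)
    return {
        label: (docs, frequent_feature_sets(docs, min_support))
        for label, docs in by_class.items()
    }


def classify(model, query, max_k=5):
    """Classify `query` by voting among candidates reached through frequent sets."""
    q = set(query)
    candidates = []  # (label, tokens) pairs associated with a matching feature set
    for label, (docs, fsets) in model.items():
        hit = set()
        for items, covered in fsets.items():
            if items <= q:  # the query contains this frequent feature set
                hit.update(covered)
        candidates.extend((label, docs[i]) for i in sorted(hit))
    if not candidates:
        return None  # no frequent feature set matched the query
    k = min(max_k, len(candidates))  # assumed rule for choosing k
    ranked = sorted(candidates, key=lambda c: cosine(query, c[1]), reverse=True)
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


training = [
    (["goal", "match", "team"], "sports"),
    (["team", "goal", "league"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "stock", "rate"], "finance"),
]
model = train(training, min_support=1.0)
print(classify(model, ["team", "goal", "win"]))
print(classify(model, ["bank", "stock"]))
```

In this sketch the candidate set is restricted to training documents that share a frequent feature set with the query, so similarity is computed against a subset of the training data rather than all of it. That is one plausible reading of how the method could reduce time complexity, not a statement of the authors' actual procedure.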
