To obtain more accurate text classification results and to examine how the choice of classifier affects them, a text classification method based on Term-Class Weight and Term-Class Density is proposed and studied with SVM and k-NN classifiers. Term-Class Weight is the ratio of the total number of documents containing a term to the number of documents of a given class containing that term. Term-Class Density is the ratio of the number of occurrences of a term in the class of interest to the number of its occurrences in the entire corpus. These two features are used as the measures for text classification. Labeled documents are assigned to the known classes, the proposed measures are used to predict the degree of relevance of a given object, and a classifier then performs the classification. Experiments on the 20 Newsgroups data set show that, compared with other similar methods, the proposed method achieves higher classification accuracy and better recall and F-measure, indicating potential application value.
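The two measures can be sketched in Python directly from their verbal definitions. This is a minimal illustration, not the authors' implementation. The function name `term_class_measures`, the token-list input format, the toy corpus, and the handling of zero denominators are all assumptions made for the example.

```python
def term_class_measures(docs, labels, term, cls):
    """Compute (Term-Class Weight, Term-Class Density) for one term and class.

    docs   : list of documents, each a list of tokens (assumed format)
    labels : list of class names, one per document
    """
    # Documents in the whole corpus that contain the term.
    docs_with_term = sum(1 for d in docs if term in d)
    # Documents of the class of interest that contain the term.
    class_docs_with_term = sum(
        1 for d, y in zip(docs, labels) if y == cls and term in d
    )
    # Occurrences of the term in the whole corpus and in the class of interest.
    occ_corpus = sum(d.count(term) for d in docs)
    occ_class = sum(d.count(term) for d, y in zip(docs, labels) if y == cls)

    # Weight: documents containing the term / class documents containing it.
    # Assumption: undefined (None) when no class document contains the term.
    weight = docs_with_term / class_docs_with_term if class_docs_with_term else None
    # Density: class occurrences / corpus occurrences.
    # Assumption: 0.0 when the term never occurs in the corpus.
    density = occ_class / occ_corpus if occ_corpus else 0.0
    return weight, density


# Hypothetical toy corpus, for illustration only.
docs = [["ball", "game", "ball"], ["game", "team"], ["vote", "law"], ["ball", "law"]]
labels = ["sport", "sport", "politics", "politics"]
weight, density = term_class_measures(docs, labels, "ball", "sport")
print(weight, density)
```

Under the definition as stated, a weight close to 1 means that nearly every document containing the term belongs to the class, and a density close to 1 means that nearly all occurrences of the term fall in the class. How the two values are combined and passed to the SVM or k-NN classifier is not specified here, so the sketch stops at computing them.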