Journal: Computer Technology and Development (计算机技术与发展)

Research on a Semi-Supervised Sentiment Classification Algorithm Based on the Cluster Kernel

     

Abstract

With the rapid development of the Internet, humanity has entered the era of big data. Text data, as a carrier of human knowledge, is of great significance for human progress and development, so using large numbers of unlabeled samples to improve the accuracy of text sentiment classification has become increasingly important. This paper applies the cluster kernel method from semi-supervised learning to the sentiment classification problem and presents a semi-supervised sentiment classification algorithm based on the cluster kernel. A weighted undirected graph is built over the labeled and unlabeled samples, the cluster kernel is computed from it, and the resulting kernel function is then used to train an SVM sentiment classifier. The method incorporates the information contained in the unlabeled samples directly into the kernel, so it does not need to build multiple classifiers and makes effective use of the unlabeled samples. Experimental results show that the CKSVM algorithm achieves markedly higher classification accuracy than semi-supervised sentiment classification algorithms based on Self-learning SVM and Co-training SVM, and that it adapts well to different data sets.
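
The pipeline described in the abstract (graph over all samples, cluster kernel, SVM) can be illustrated with a minimal sketch. It follows one commonly described cluster kernel construction: RBF edge weights, symmetric normalization, eigendecomposition, and a transfer function applied to the eigenvalues. The abstract does not state the graph weights, the transfer function, or any parameter values, so the function name `cluster_kernel`, the RBF weights, the polynomial transfer function, and the defaults `sigma` and `power` are all assumptions made for illustration, not the authors' settings.

```python
import numpy as np


def cluster_kernel(X, sigma=1.0, power=2):
    """Return a cluster-kernel Gram matrix over all rows of X.

    X stacks labeled and unlabeled samples, one feature vector per row.
    sigma (RBF width) and power (transfer-function exponent) are
    illustrative assumptions, not values taken from the paper.
    """
    # Weighted undirected graph: RBF affinity between every pair of samples.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off negatives
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))

    # Symmetric normalization: L = D^{-1/2} W D^{-1/2}, D = diag(row sums).
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Eigendecomposition of the symmetric matrix L.
    eigvals, eigvecs = np.linalg.eigh(L)

    # Assumed polynomial transfer function: it shrinks small eigenvalues,
    # which emphasizes the cluster structure carried by the leading ones.
    new_vals = np.clip(eigvals, 0.0, None) ** power
    L_tilde = (eigvecs * new_vals) @ eigvecs.T

    # Rescale so that every sample has unit self-similarity.
    scale = 1.0 / np.sqrt(np.diag(L_tilde))
    K = L_tilde * scale[:, None] * scale[None, :]
    return (K + K.T) / 2.0  # remove tiny numerical asymmetry
```

Because the kernel is computed over labeled and unlabeled samples together, the unlabeled samples influence the similarity between labeled ones without any additional classifier. Assuming scikit-learn is used for the SVM step, the sub-matrix of `K` indexed by the labeled samples can be passed to `sklearn.svm.SVC(kernel="precomputed")` for training, and the rows for the samples to be classified (with columns restricted to the labeled samples) can be passed to `predict`.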
