Semi-supervised clustering uses supervised information about the samples to improve the performance of unsupervised learning. This supervised information includes class labels and pairwise constraints (must-link constraints and cannot-link constraints). This paper presents a semi-supervised clustering algorithm based on class labels and pairwise constraints (PLG-SSC). The algorithm draws on the strengths of the genetic algorithm and makes full use of both kinds of supervised information to guide unsupervised clustering. Experimental results on UCI data sets show that the PLG-SSC algorithm effectively improves clustering accuracy and is a promising semi-supervised clustering algorithm.
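To make the two kinds of supervised information concrete, the following minimal sketch shows how class labels on a few samples induce must-link and cannot-link constraints, and how a candidate clustering can be checked against them. It is an illustration only and not the PLG-SSC algorithm: the function names `constraints_from_labels` and `count_violations` are hypothetical, and the abstract does not state how PLG-SSC combines this information with the genetic algorithm.

```python
from itertools import combinations


def constraints_from_labels(labels):
    """Derive pairwise constraints from a partial labeling.

    labels: dict mapping sample index -> class label (labeled samples only).
    Returns (must_link, cannot_link) as sets of index pairs (i, j) with i < j.
    """
    must_link, cannot_link = set(), set()
    for i, j in combinations(sorted(labels), 2):
        if labels[i] == labels[j]:
            must_link.add((i, j))      # same class: should share a cluster
        else:
            cannot_link.add((i, j))    # different class: should be separated
    return must_link, cannot_link


def count_violations(assignment, must_link, cannot_link):
    """Count constraints that a cluster assignment breaks.

    assignment: sequence where assignment[i] is the cluster id of sample i.
    """
    broken_ml = sum(1 for i, j in must_link if assignment[i] != assignment[j])
    broken_cl = sum(1 for i, j in cannot_link if assignment[i] == assignment[j])
    return broken_ml + broken_cl


# Assumed toy data: four samples, three of which carry a class label.
labels = {0: "a", 1: "a", 3: "b"}
must_link, cannot_link = constraints_from_labels(labels)
violations = count_violations([0, 0, 1, 1], must_link, cannot_link)
```

A violation count of this kind could, for example, act as a penalty when candidate clusterings are compared, although whether PLG-SSC does so is an assumption here rather than something the abstract confirms.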