Journal: Applied Soft Computing

Weighted samples based semi-supervised classification


Abstract

Graph-based semi-supervised classification (GSSC) treats labeled and unlabeled samples as vertices of a graph and uses edge weights to encode the similarity between samples. Most GSSC methods treat every labeled sample as equally important and focus mainly on optimizing the graph to improve performance. In practice, samples are not always evenly distributed: labeled samples close to the decision boundary between classes are generally more important than labeled samples far from it. To account for these differences in importance, we propose an approach called Weighted Samples based Semi-Supervised Classification (WS3C for short). WS3C first runs multiple clusterings on the dataset to explore the structure of the samples and summarizes the clustering results. Second, it quantifies a hard-to-cluster index for each labeled sample with respect to the other samples, based on the summarized results, and uses the index to weight that sample. Next, it constructs a graph whose edge weights equal the frequency with which two samples are grouped into the same cluster across the multiple clusterings. Finally, it performs semi-supervised classification based on the constructed graph and the weighted samples. An empirical study on synthetic and real datasets demonstrates that assigning different weights to labeled samples significantly improves accuracy compared with treating all labeled samples equally. WS3C not only outperforms the related methods it is compared with, but is also robust to its input parameters. (C) 2019 Elsevier B.V. All rights reserved.
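
The four steps in the abstract can be illustrated with a rough sketch. The abstract gives no formulas, so nearly everything below is an assumption rather than the authors' method: a tiny k-means stands in for the base clusterer, the hard-to-cluster index is taken to be the mean of 4c(1 - c) over co-clustering frequencies c, labeled samples are weighted by 1 + index, and classification uses a standard closed-form label propagation. All function names are hypothetical.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=20):
    """Tiny Lloyd's k-means (assumed base clusterer); returns one cluster id per sample."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center where it is
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def co_association(X, n_runs=20, ks=(2, 3, 4), seed=0):
    """Steps 1 and 3: run several clusterings and record how often each
    pair of samples lands in the same cluster (a frequency in [0, 1])."""
    rng = np.random.default_rng(seed)
    C = np.zeros((len(X), len(X)))
    for r in range(n_runs):
        labels = kmeans_labels(X, ks[r % len(ks)], rng)
        C += labels[:, None] == labels[None, :]
    return C / n_runs

def hard_to_cluster_index(C):
    """Step 2 (assumed form): 4*c*(1-c) peaks at c = 0.5, i.e. when the
    clusterings disagree most about a pair; average over the other samples."""
    U = 4.0 * C * (1.0 - C)
    np.fill_diagonal(U, 0.0)
    return U.sum(axis=1) / (len(C) - 1)

def weighted_propagation(C, y, n_classes, alpha=0.9):
    """Step 4 (assumed form): label propagation on the co-association graph,
    with each labeled sample's one-hot label scaled by 1 + its index.
    y holds a class id for labeled samples and -1 for unlabeled ones."""
    W = C.copy()
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    deg[deg == 0] = 1.0
    S = W / np.sqrt(np.outer(deg, deg))  # symmetric normalization
    labeled = np.flatnonzero(y >= 0)
    Y = np.zeros((len(y), n_classes))
    Y[labeled, y[labeled]] = 1.0 + hard_to_cluster_index(C)[labeled]
    F = np.linalg.solve(np.eye(len(y)) - alpha * S, Y)
    return F.argmax(axis=1)

# Toy data: two well-separated blobs, two labeled samples per class.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(10.0, 0.5, (30, 2))])
truth = np.repeat([0, 1], 30)
y = np.full(60, -1)
y[[0, 1, 30, 31]] = truth[[0, 1, 30, 31]]

C = co_association(X)
pred = weighted_propagation(C, y, n_classes=2)
print("accuracy:", (pred == truth).mean())
```

On data this cleanly separated the weighting has little effect; it is intended to matter when labeled samples sit near a class boundary, where the clusterings disagree and the index is large.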
