Pattern Recognition: The Journal of the Pattern Recognition Society

Graph-based boosting algorithm to learn labeled and unlabeled data



Abstract

Ensemble learning is an effective technique for learning from data by combining multiple models. Usually, however, the combined models are supervised learning algorithms, which need a large amount of labeled data to tune their parameters. Some ensemble learning algorithms have been proposed to exploit the information in unlabeled data. Because labeled data are scarce, these methods have to learn from samples with pseudo-labels, and such samples inevitably introduce wrong information during training. In this paper, we propose a novel graph-based boosting (GBB) algorithm to learn from labeled and unlabeled data. GBB is a framework that combines many models linearly, and no pseudo-labels are used during training. At each iteration, GBB assigns a new weighting vector to the labeled samples and a transformed similarity matrix to all samples in order to train the combined model. We also extend GBB to a variant termed weighted GBB (WGBB), which learns from imbalanced data by adding a weighting vector for the labeled data. Finally, 14 relatively balanced datasets and 22 imbalanced datasets are used to validate the performance of GBB and WGBB, respectively. Experimental results show that, compared with other related algorithms, GBB achieves competitive performance and WGBB has a clear advantage in classifying imbalanced data. (C) 2020 Elsevier Ltd. All rights reserved.
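The abstract names two ingredients without giving formulas: a similarity matrix over all samples (labeled and unlabeled) and a linear combination of models. The sketch below illustrates these two generic building blocks only. It is not the GBB algorithm: the Gaussian kernel, the `sigma` parameter, the function names `rbf_similarity` and `linear_ensemble`, and the restriction to binary labels in {-1, +1} are all assumptions made for illustration, and the paper's weighting-vector update and similarity-matrix transformation are not reproduced here.

```python
import numpy as np


def rbf_similarity(X, sigma=1.0):
    """Dense Gaussian similarity matrix over all samples.

    Assumption: a Gaussian (RBF) kernel is used; the abstract does not
    say how the paper builds or transforms its similarity matrix.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops in the graph
    return W


def linear_ensemble(models, alphas, X):
    """Binary prediction from F(x) = sum_t alpha_t * h_t(x).

    `models` are callables returning scores or {-1, +1} votes per sample;
    `alphas` are their combination weights (hypothetical values here).
    """
    scores = sum(a * h(X) for a, h in zip(alphas, models))
    return np.where(scores >= 0, 1, -1)


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
    W = rbf_similarity(X)
    # Two hand-written decision stumps stand in for trained base models.
    stumps = [
        lambda X: np.where(X[:, 0] > 2.0, 1, -1),
        lambda X: np.where(X[:, 1] > 0.5, 1, -1),
    ]
    print(W)
    print(linear_ensemble(stumps, [0.7, 0.3], X))
```

In a semi-supervised setting, a matrix like `W` lets unlabeled samples influence training through their proximity to other samples rather than through guessed labels, which is the motivation the abstract gives for avoiding pseudo-labels.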
