
An optimization framework for combining ensembles of classifiers and clusterers with applications to nontransductive semisupervised learning and transfer learning.


Abstract

Unsupervised models can provide supplementary soft constraints to help classify new "target" data because similar instances in the target set are more likely to share the same class label. Such models can also help detect possible differences between the training and target distributions, which is useful in applications where concept drift may take place, as in transfer learning settings. This article describes a general optimization framework that takes as input class membership estimates from existing classifiers learned on previously encountered "source" (or training) data, as well as a similarity matrix from a cluster ensemble operating solely on the target (or test) data to be classified, and yields a consensus labeling of the target data. More precisely, the application settings considered are nontransductive semisupervised and transfer learning scenarios where the training data are used only to build an ensemble of classifiers and are subsequently discarded before classifying the target data. The framework admits a wide range of loss functions and classification/clustering methods. It exploits properties of Bregman divergences in conjunction with Legendre duality to yield a principled and scalable approach. A variety of experiments show that the proposed framework can yield results substantially superior to those obtained by naïvely applying classifiers learned on the original task to the target data. In addition, we show that the proposed approach, even though it is not conceptually transductive, can provide better results than some popular transductive learning techniques.
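To make the core idea concrete, the following is a minimal sketch, not the paper's actual algorithm: it assumes a squared-loss instantiation in which consensus estimates are iteratively pulled toward a similarity-weighted average of their neighbors' estimates while staying anchored to the classifier-ensemble outputs. The function name `consensus_labels`, the trade-off parameter `alpha`, and the simple fixed-point iteration are all illustrative assumptions; the paper's framework covers general Bregman divergences.

```python
import numpy as np

def consensus_labels(pi, S, alpha=0.5, n_iters=50):
    """Refine classifier class-probability estimates using target-side
    cluster structure (hypothetical squared-loss sketch).

    pi : (n, k) soft class-membership estimates from the classifier ensemble.
    S  : (n, n) similarity (co-association) matrix from the cluster ensemble,
         entries in [0, 1]; only the target data is used to build it.
    """
    y = pi.copy()
    for _ in range(n_iters):
        # Similarity-weighted average of the current consensus estimates.
        neighbour_avg = S @ y
        # Each row balances the classifier output against its neighbours;
        # the denominator keeps each row summing to one.
        denom = 1.0 + alpha * S.sum(axis=1, keepdims=True)
        y = (pi + alpha * neighbour_avg) / denom
    return y

# Toy example: three target instances, two classes. Instances 0 and 1 are
# always co-clustered; instance 2 is isolated.
pi = np.array([[0.9, 0.1],
               [0.5, 0.5],
               [0.2, 0.8]])
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
y = consensus_labels(pi, S)
```

In this toy run the uncertain instance 1 is pulled toward class 0 by its confident cluster-mate, while the isolated instance 2 keeps its original classifier estimate, which is exactly the "soft constraint" role the abstract attributes to the unsupervised side.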
