Improved Generative Semisupervised Learning Based on Finely Grained Component-Conditional Class Labeling

Abstract

We introduce new inductive, generative semisupervised mixtures with more finely grained class label generation mechanisms than in previous work. Our models combine advantages of semisupervised mixtures, which achieve label extrapolation over a component, and nearest-neighbor (NN)/nearest-prototype (NP) classification, which achieves accurate classification in the vicinity of labeled samples or prototypes. For our NN-based method, we propose a novel two-stage stochastic data generation, with all samples first generated using a standard finite mixture and then all class labels generated, conditioned on the samples and their components of origin. This mechanism entails an underlying Markov random field, specific to each mixture component or cluster. We invoke the pseudo-likelihood formulation, which forms the basis for an approximate generalized expectation-maximization model learning algorithm. Our NP-based model overcomes a problem with the NN-based model that manifests at very low labeled fractions. Both models are advantageous when within-component class proportions are not constant over the feature space region "owned by" a component. The practicality of this scenario is borne out by experiments on UC Irvine data sets, which demonstrate significant gains in classification accuracy over previous semisupervised mixtures, as well as overall gains over KNN classification. Moreover, for very small labeled fractions, our methods overall outperform supervised linear and nonlinear kernel support vector machines.
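The abstract's two-stage generative mechanism can be illustrated with a minimal sketch: draw samples from a standard finite (here Gaussian) mixture, then draw class labels conditioned on each sample and its component of origin. The component-conditional label model below is a hypothetical stand-in chosen for brevity (a logistic function of the sample's offset within its component), not the paper's component-specific Markov random field; all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D Gaussian mixture parameters (illustration only).
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [4.0, 4.0]])
covs = np.array([np.eye(2), np.eye(2)])

def sample_two_stage(n, beta=1.0):
    """Two-stage generation: (1) draw all samples from a finite mixture,
    (2) draw class labels conditioned on each sample and its component
    of origin.  Within component j, the class-1 probability varies with
    the sample's position, so within-component class proportions need
    not be constant over the region "owned by" that component."""
    # Stage 1: component indicators and feature vectors.
    z = rng.choice(len(weights), size=n, p=weights)
    x = np.array([rng.multivariate_normal(means[j], covs[j]) for j in z])

    # Stage 2: component-conditional class labels.  A toy mechanism with
    # the same conditional structure p(c | x, z) as in the abstract: a
    # per-component direction along which class 1 becomes more likely.
    directions = np.array([[1.0, -1.0], [-1.0, 1.0]])
    logits = beta * np.einsum('ij,ij->i', x - means[z], directions[z])
    p_class1 = 1.0 / (1.0 + np.exp(-logits))
    c = (rng.random(n) < p_class1).astype(int)
    return x, z, c

X, Z, C = sample_two_stage(500)
print(X.shape, np.bincount(Z), np.bincount(C))
```

In the paper's NN-based model the stage-2 label distribution couples nearby samples within a component; the independent logistic labeling above merely shows where that conditioning enters the generative process.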

Bibliographic Details

  • Source
    Neural Computation, 2012, Issue 7, pp. 1926-1966 (41 pages)
  • Author Affiliations

    Department of Electrical Engineering, Pennsylvania State University, University Park, PA 16802, U.S.A.;

    Department of Electrical Engineering, Pennsylvania State University, University Park, PA 16802, U.S.A.;

    Department of Electrical Engineering and Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA 16802, U.S.A.;

    Center for NMR Research, Radiology, Pennsylvania State University College of Medicine, Hershey, PA 17033, U.S.A.;

  • Indexed In: Science Citation Index (SCI); Chemical Abstracts (CA)
  • Original Format: PDF
  • Language: English
