Statistical Analysis and Data Mining

A flexible procedure for mixture proportion estimation in positive‐unlabeled learning



Abstract

Positive-unlabeled (PU) learning considers two samples, a positive set P with observations from only one class and an unlabeled set U with observations from two classes. The goal is to classify the observations in U. Class mixture proportion estimation (MPE) in U is a key step in PU learning. Blanchard et al. showed that MPE in PU learning is a generalization of the problem of estimating the proportion of true null hypotheses in multiple testing problems. Motivated by this idea, we propose reducing the problem to one dimension by constructing a probabilistic classifier trained on the P and U data sets and then applying a one-dimensional mixture proportion method from the multiple testing literature to the resulting class probabilities. The flexibility of this framework lies in the freedom to choose the classifier and the one-dimensional MPE method. We prove consistency of two mixture proportion estimators using bounds from empirical process theory, develop tuning-parameter-free implementations, and demonstrate that they have competitive performance on simulated waveform data and a protein signaling problem.
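To make the two-step reduction concrete, here is a minimal sketch in Python, assuming NumPy and scikit-learn are available. The logistic-regression classifier and the simple threshold-ratio estimator in the second step are illustrative stand-ins chosen for brevity; they are not the specific estimators whose consistency is proven in the paper, and the function and parameter names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_mixture_proportion(X_pos, X_unl, n_thresholds=50, min_tail=0.05):
    """Estimate alpha, the proportion of positive-class observations in U.

    Step 1 (reduction to one dimension): train a probabilistic classifier to
    separate the positive sample P (label 1) from the unlabeled sample U
    (label 0), and score every observation.

    Step 2 (one-dimensional MPE): since U = alpha*F_P + (1 - alpha)*F_N,
    for any threshold t we have P_U(score > t) >= alpha * P_P(score > t),
    so the ratio of empirical tail probabilities upper-bounds alpha; take
    the minimum over a grid of thresholds (a simple illustrative estimator).
    """
    X = np.vstack([X_pos, X_unl])
    y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_unl))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    s_pos = clf.predict_proba(X_pos)[:, 1]   # class probabilities for P
    s_unl = clf.predict_proba(X_unl)[:, 1]   # class probabilities for U

    # Cap thresholds so that at least ~min_tail of the P scores exceed t,
    # keeping the denominator away from zero.
    thresholds = np.quantile(s_pos, np.linspace(0.0, 1.0 - min_tail, n_thresholds))
    ratios = [(s_unl > t).mean() / max((s_pos > t).mean(), 1e-12)
              for t in thresholds]
    return float(min(1.0, min(ratios)))
```

In use, one would pass the labeled positive observations as X_pos and the unlabeled observations as X_unl; the return value estimates the fraction of U drawn from the positive class. Any probabilistic classifier and any one-dimensional MPE method could be substituted in the two steps, which is the flexibility the abstract describes.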
