IEICE Technical Report: Information-Based Induction Sciences and Machine Learning (電子情報通信学会技術研究報告. 情報論的学習理論と機械学習)

Learning from Positive and Unlabeled Data 2: Computationally Efficient Estimation of Class Priors


Abstract

We consider the problem of estimating the class prior in an unlabeled dataset. Under the assumption that an additional labeled dataset is available, the class prior can be estimated by fitting a mixture of class-wise data distributions to the unlabeled data distribution. However, in practice, such an additional labeled dataset is often not available. In this paper, we show that, with additional samples coming only from the positive class, the class prior of the unlabeled dataset can be estimated correctly. Our key idea is to use properly penalized divergences for model fitting to cancel the error caused by the absence of negative samples. We further show that the use of the penalized L1-distance gives a computationally efficient algorithm with an analytic solution, and establish its uniform deviation bound and estimation error bound. Finally, we experimentally demonstrate the usefulness of the proposed method.
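To illustrate the general idea, the following is a minimal, self-contained sketch (not the paper's actual algorithm) of class-prior estimation from positive and unlabeled data: the positive-class density p+(x) and the unlabeled density p(x) are estimated with simple kernel density estimates, and the prior pi is chosen to fit pi * p+(x) to p(x) under an L1-type objective with an extra penalty on the region where pi * p+(x) exceeds p(x). The penalty term here is an illustrative stand-in for the paper's penalized divergence; without some such penalty, partial matching with the L1 distance tends to overestimate the prior because negative samples are absent. All names and the exact objective form below are assumptions for the sketch.

```python
import numpy as np

# Hypothetical 1-D example: positive class ~ N(0, 1), negative class ~ N(4, 1).
rng = np.random.default_rng(0)
true_pi = 0.7
n_pos, n_unl = 2000, 2000

x_pos = rng.normal(0.0, 1.0, n_pos)                      # positive-only samples
is_pos = rng.random(n_unl) < true_pi                     # latent labels (unseen)
x_unl = np.where(is_pos,
                 rng.normal(0.0, 1.0, n_unl),
                 rng.normal(4.0, 1.0, n_unl))            # unlabeled mixture

def kde(samples, grid, h=0.3):
    """Gaussian kernel density estimate evaluated on a grid."""
    d = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * d ** 2).mean(axis=1) / (h * np.sqrt(2.0 * np.pi))

grid = np.linspace(-5.0, 10.0, 600)
dx = grid[1] - grid[0]
p_plus = kde(x_pos, grid)    # estimate of the positive-class density p+(x)
p_unl = kde(x_unl, grid)     # estimate of the unlabeled density p(x)

def objective(pi):
    """L1 partial-matching cost plus an illustrative overestimation penalty."""
    l1 = np.sum(np.abs(pi * p_plus - p_unl)) * dx
    penalty = np.sum(np.maximum(pi * p_plus - p_unl, 0.0)) * dx
    return l1 + penalty

# Grid search over pi; the paper derives an analytic solution instead.
pis = np.linspace(0.0, 1.0, 101)
pi_hat = pis[np.argmin([objective(p) for p in pis])]
print(f"estimated class prior: {pi_hat:.2f}")  # should be close to true_pi
```

Because the two class-conditional densities are well separated in this toy setup, the objective drops steeply as pi approaches the true prior and rises once pi * p+(x) overshoots p(x), so the grid search recovers a value near 0.7. The paper replaces this brute-force search with a closed-form estimator and gives finite-sample error bounds.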

