Conference: IEEE International Conference on Data Mining Workshops

Positive-Unlabeled Learning in the Face of Labeling Bias


Abstract

Positive-Unlabeled (PU) learning is a class of semi-supervised learning in which only a fraction of the data is labeled and all available labels are positive. The goal is to assign correct (positive and negative) labels to as much of the data as possible. Several important learning problems fall into the PU-learning domain, because obtaining negative examples is often prohibitively costly or infeasible. In addition to the positive-negative disparity, the overall cost of labeling these datasets typically leads to situations where unlabeled examples greatly outnumber labeled ones. Accordingly, we perform several experiments, on both synthetic and real-world datasets, examining the performance of state-of-the-art PU-learning algorithms when there is significant bias in the labeling process. We propose novel PU algorithms and demonstrate that they outperform the current state of the art on a variety of benchmarks. Lastly, we present a methodology for removing the costly parameter-tuning step in a popular PU algorithm.
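To make the setting concrete, the following is a minimal sketch of how a synthetic PU dataset with labeling bias might be constructed. It is an illustration only, not the authors' experimental setup: the function name `make_pu_dataset`, the one-dimensional Gaussian classes, and the exponential bias weighting are all assumptions chosen for this example. A `bias` of 0 labels positives uniformly at random; a larger `bias` makes positives with large feature values more likely to be labeled.

```python
import math
import random


def make_pu_dataset(n=1000, label_frac=0.2, bias=0.0, seed=0):
    """Build a toy 1-D PU dataset (hypothetical helper for illustration).

    Returns (xs, ys, s): features, hidden true classes (1 = positive,
    0 = negative), and observed labels (1 = labeled positive, 0 = unlabeled).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.5 else 0
        # Assumed class distributions: positives near +2, negatives near -2.
        xs.append(rng.gauss(2.0 if y else -2.0, 1.0))
        ys.append(y)

    positives = [i for i in range(n) if ys[i] == 1]
    n_labeled = int(label_frac * len(positives))

    # Weighted sampling without replacement (Efraimidis-Spirakis keys):
    # a positive with weight w gets key log(u) / w; the largest keys win.
    keyed = []
    for i in positives:
        w = math.exp(bias * xs[i])  # assumed form of the labeling bias
        u = 1.0 - rng.random()      # in (0, 1], so log(u) is defined
        keyed.append((math.log(u) / w, i))
    keyed.sort(reverse=True)
    labeled = {i for _, i in keyed[:n_labeled]}

    # Only positives are ever labeled; everything else stays unlabeled.
    s = [1 if i in labeled else 0 for i in range(n)]
    return xs, ys, s


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    for bias in (0.0, 1.0):
        xs, ys, s = make_pu_dataset(bias=bias)
        pos_x = [x for x, y in zip(xs, ys) if y == 1]
        lab_x = [x for x, l in zip(xs, s) if l == 1]
        print(f"bias={bias}: labeled={len(lab_x)}, "
              f"mean x of all positives={mean(pos_x):.2f}, "
              f"mean x of labeled positives={mean(lab_x):.2f}")
```

Under a biased labeling process, the labeled positives no longer look like a random sample of all positives, which is the situation the abstract describes as problematic for PU algorithms.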
