IEEE International Conference on Data Mining Workshops

Positive-Unlabeled Learning in the Face of Labeling Bias

Abstract

Positive-Unlabeled (PU) learning is a class of semi-supervised learning in which only a fraction of the data is labeled and all available labels are positive. The goal is to assign correct (positive or negative) labels to as much of the data as possible. Several important learning problems fall into the PU-learning domain, because in many cases obtaining negative examples is prohibitively costly or infeasible. In addition to this positive-negative disparity, the overall cost of labeling these datasets typically means that unlabeled examples greatly outnumber labeled ones. Accordingly, we perform several experiments, on both synthetic and real-world datasets, examining the performance of state-of-the-art PU-learning algorithms when there is significant bias in the labeling process. We propose novel PU algorithms and demonstrate that they outperform the current state of the art on a variety of benchmarks. Lastly, we present a methodology for removing the costly parameter-tuning step in a popular PU algorithm.
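
To make the setting concrete, the following is a minimal sketch of how a PU dataset with and without labeling bias might be simulated. It is not the authors' experimental setup: the 1-D Gaussian data, the helper `make_pu_labels`, and the two labeling-probability functions are all hypothetical choices made only for illustration.

```python
import random


def make_pu_labels(points, true_labels, label_prob, seed=0):
    """Return PU labels: 1 = labeled positive, 0 = unlabeled.

    Only true positives can ever be labeled. `label_prob(x)` is an assumed
    (hypothetical) probability that a positive example at x gets a label.
    """
    rng = random.Random(seed)
    return [
        1 if y == 1 and rng.random() < label_prob(x) else 0
        for x, y in zip(points, true_labels)
    ]


def mean(values):
    return sum(values) / len(values)


# Synthetic 1-D data (assumption): positives centered at +1, negatives at -1.
data_rng = random.Random(42)
true_labels = [1 if data_rng.random() < 0.5 else 0 for _ in range(2000)]
points = [data_rng.gauss(1.0 if y == 1 else -1.0, 1.0) for y in true_labels]

# Unbiased labeling: every positive is equally likely to be labeled.
unbiased = make_pu_labels(points, true_labels, lambda x: 0.2)

# Biased labeling: positives far from the negative class are labeled far
# more often than positives near the class boundary.
biased = make_pu_labels(points, true_labels, lambda x: 0.4 if x > 1.0 else 0.02)

mean_all_pos = mean([x for x, y in zip(points, true_labels) if y == 1])
mean_unbiased = mean([x for x, s in zip(points, unbiased) if s == 1])
mean_biased = mean([x for x, s in zip(points, biased) if s == 1])

print("labeled (unbiased):", sum(unbiased), "of", len(points))
print("labeled (biased):  ", sum(biased), "of", len(points))
print("mean of all positives:          ", round(mean_all_pos, 2))
print("mean of labeled, unbiased:      ", round(mean_unbiased, 2))
print("mean of labeled, biased:        ", round(mean_biased, 2))
```

Under the biased scheme the labeled positives are no longer a representative sample of the positive class, which is the kind of situation the abstract refers to as bias in the labeling process.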