IEEE Transactions on Cybernetics

Active Learning With Imbalanced Multiple Noisy Labeling


Abstract

With crowdsourcing systems, it is easy to collect multiple noisy labels for the same object for supervised learning. This dynamic annotation procedure fits the active learning perspective and gives rise to the imbalanced multiple noisy labeling problem. This paper proposes a novel active learning framework for crowdsourcing systems with multiple imperfect annotators. The framework contains two core procedures: label integration and instance selection. In the label integration procedure, a positive label threshold (PLAT) algorithm is introduced to induce the class membership of each training instance from its set of multiple noisy labels. PLAT solves the imbalanced labeling problem by dynamically adjusting the threshold that determines the class membership of an example. Furthermore, three novel instance selection strategies are proposed to adapt PLAT for improving learning performance. These strategies are based, respectively, on the uncertainty derived from the multiple labels, the uncertainty derived from the learned model, and a combination of the two (CFI). Experimental results on 12 datasets with different underlying class distributions demonstrate that the three instance selection strategies significantly improve learning performance, and that CFI performs best when labeling behaviors exhibit different levels of imbalance in crowdsourcing systems. We also apply our methods to a real-world scenario, obtaining noisy labels from Amazon Mechanical Turk, and show that our proposed strategies achieve very high performance.
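The abstract does not give PLAT's exact thresholding rule or the precise uncertainty measures, but the two core procedures it names can be illustrated with a simplified sketch: integrate each instance's noisy binary labels by comparing its positive-label ratio to a data-driven threshold (here approximated by the mean positive ratio across the training set, an assumption rather than the paper's rule), and rank instances for further labeling by the entropy of their observed labels.

```python
import math

def integrate_labels(label_sets):
    """Integrate multiple noisy binary labels per instance into one class
    label by comparing the instance's positive-label ratio to a threshold.
    PLAT adjusts this threshold dynamically; we approximate it with the
    mean positive ratio over the training set (a hypothetical stand-in)."""
    pos_ratios = [sum(ls) / len(ls) for ls in label_sets]
    threshold = sum(pos_ratios) / len(pos_ratios)
    integrated = [1 if r > threshold else 0 for r in pos_ratios]
    return integrated, threshold

def label_uncertainty(label_set):
    """Entropy of the observed label multiset: instances whose noisy
    labels disagree most are the most uncertain, making them natural
    candidates for requesting additional labels (instance selection)."""
    p = sum(label_set) / len(label_set)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Toy example: three instances, each annotated by five crowd workers.
labels = [[1, 1, 1, 0, 1], [0, 0, 1, 0, 0], [1, 0, 1, 0, 1]]
integrated, threshold = integrate_labels(labels)
# Most uncertain instance first (3-vs-2 split has the highest entropy).
ranked = sorted(range(len(labels)),
                key=lambda i: -label_uncertainty(labels[i]))
```

A model-uncertainty strategy would instead rank instances by the classifier's predictive entropy, and the CFI-style combination would merge both rankings; those variants follow the same pattern with a different scoring function.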
机译:使用众包系统,可以很容易地为同一个对象收集多个嘈杂的标签,以进行监督学习。这种动态注释过程适合主动学习的观点,并伴随着不平衡的多重噪声标签问题。本文提出了一种新颖的主动学习框架,其中包含涉及众包系统的多个不完善的注释器。该框架包含两个核心过程:标签集成和实例选择。在标签集成过程中,引入了正标签阈值(PLAT)算法,以从训练集中每个实例的多个嘈杂标签集中诱发类成员资格。 PLAT通过动态调整用于确定示例的类成员资格的阈值来解决标签不平衡的问题。此外,提出了三种新颖的实例选择策略来适应PLAT以提高学习性能。这些策略分别基于多个标签衍生的不确定性,学习模型衍生的不确定性和组合方法(CFI)。在具有不同基础类别分布的12个数据集上的实验结果表明,三种新颖的实例选择策略显着提高了学习性能,而当标签行为在众包系统中表现出不同程度的失衡时,CFI表现最佳。我们还将我们的方法应用于实际情况,从Amazon Mechanical Turk获得嘈杂的标签,并表明我们提出的策略可实现非常高的性能。
