Journal: IEEE Transactions on Knowledge and Data Engineering

Active Learning without Knowing Individual Instance Labels: A Pairwise Label Homogeneity Query Approach

Abstract

Traditional active learning methods require the labeler to provide a class label for each queried instance. To ensure that the provided labels are correct, labelers are normally highly skilled domain experts, which makes labeling expensive. To reduce labeling cost, an alternative is to let nonexpert labelers carry out the labeling task without explicitly stating the class label of each queried instance. In this paper, we propose a new active learning paradigm in which a nonexpert labeler is only asked “whether a pair of instances belong to the same class”, namely, a pairwise label homogeneity query. Under such circumstances, our active learning goal is twofold: (1) decide which pair of instances should be selected for query, and (2) decide how to use the pairwise homogeneity information to improve the active learner. To achieve this goal, we propose a “Pairwise Query on Max-flow Paths” strategy to query pairwise label homogeneity from a nonexpert labeler; the query results are then used to dynamically update a Min-cut model that differentiates instances in different classes. In addition, a “Confidence-based Data Selection” measure is used to evaluate data utility based on the Min-cut model’s prediction results. The selected instances, with their inferred class labels, are added to the labeled set to form a closed-loop active learning process. Experimental results and comparisons with state-of-the-art methods demonstrate that the new active learning paradigm can achieve good performance with nonexpert labelers.
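
The abstract gives no implementation details, so the following is only a loose illustration of the general idea of letting a same-class/different-class answer reshape a min-cut partition. It is not the paper's algorithm. Everything in it is an assumption made for the sketch: the two-class setting with one labeled seed per class, the integer similarity weights, the helper names (`build_graph`, `apply_answer`, `min_cut_partition`), and the rule that a "same class" answer raises an edge to a large capacity `BIG` while a "different class" answer sets it to zero.

```python
from collections import deque

BIG = 10**6  # assumed "do not cut this edge" capacity for same-class answers


def build_graph(similarity):
    """Build a symmetric capacity map from {(u, v): weight} pairs."""
    graph = {}
    for (u, v), w in similarity.items():
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def apply_answer(graph, u, v, same_class):
    """Fold one pairwise homogeneity answer into the graph (hypothetical rule)."""
    w = BIG if same_class else 0
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w


def min_cut_partition(graph, source, sink):
    """Edmonds-Karp max-flow. Returns (cut value, nodes on the source side)."""
    residual = {u: dict(nbrs) for u, nbrs in graph.items()}
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        if sink not in parents:
            # No augmenting path left: reachable nodes form the source side.
            return flow, set(parents)
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = BIG
        v = sink
        while parents[v] is not None:
            bottleneck = min(bottleneck, residual[parents[v]][v])
            v = parents[v]
        v = sink
        while parents[v] is not None:
            u = parents[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# "a" and "b" are labeled seeds of two different classes; "x" and "y" are unlabeled.
graph = build_graph({("a", "x"): 3, ("x", "y"): 2, ("y", "b"): 4})

# Before any query, the cheapest cut separates x from y.
first_value, first_side = min_cut_partition(graph, "a", "b")

# A nonexpert labeler answers "x and y belong to the same class".
apply_answer(graph, "x", "y", same_class=True)
second_value, second_side = min_cut_partition(graph, "a", "b")

print(first_value, sorted(first_side))    # 2 ['a', 'x']
print(second_value, sorted(second_side))  # 3 ['a']
```

In this toy graph the single answer moves `x` from the side of seed `a` to the side of seed `b`, even though the labeler never stated the class of either instance. How the paper chooses which pair to query and how it scores confidence are not specified in the abstract and are not modeled here.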