
Learning safe multi-label prediction for weakly labeled data


Abstract

In this paper we study multi-label learning with weakly labeled data, i.e., settings where the labels of training examples are incomplete, which commonly occurs in real applications such as image classification and document categorization. This setting includes, e.g., (i) semi-supervised multi-label learning, where only some of the training examples are completely labeled; (ii) weak label learning, where the relevant labels of examples are partially known; (iii) extended weak label learning, where the relevant and irrelevant labels of examples are partially known. Previous studies often expect that learning with weakly labeled data will improve performance, since more data are employed. This, however, is not always the case in reality: weakly labeled data may sometimes degrade learning performance. It is therefore desirable to learn safe multi-label predictions that will not hurt performance when weakly labeled data are involved in the learning procedure. In this work we optimize multi-label evaluation metrics ($F_1$ score and Top-$k$ precision) under the assumption that the ground-truth label assignment is realized by a convex combination of base multi-label learners. To cope with the infinite number of possible ground-truth label assignments, a cutting-plane strategy is adopted to iteratively generate the most helpful label assignments. The whole optimization is cast efficiently as a series of simple linear programs. Extensive experiments on three weakly labeled learning tasks, namely (i) semi-supervised multi-label learning, (ii) weak label learning, and (iii) extended weak label learning, clearly show that our proposal improves the safeness of using weakly labeled data compared with many state-of-the-art methods.
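
To make the two evaluation metrics named in the abstract concrete, the following is a minimal Python sketch of an example-based $F_1$ score, Top-$k$ precision, and a convex combination of base-learner scores. It is an illustration only: the function names (`f1_score`, `top_k_precision`, `convex_combination`), the convention for empty label sets, and the choice to mix real-valued scores are assumptions made here, and the sketch does not reproduce the paper's cutting-plane or linear-programming procedure.

```python
def f1_score(true_labels, predicted_labels):
    """Example-based F1 between two collections of label indices."""
    true_set, pred_set = set(true_labels), set(predicted_labels)
    if not true_set and not pred_set:
        # Assumed convention: two empty label sets count as a perfect match.
        return 1.0
    overlap = len(true_set & pred_set)
    return 2.0 * overlap / (len(true_set) + len(pred_set))


def top_k_precision(true_labels, scores, k):
    """Fraction of the k highest-scored labels that are truly relevant."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    true_set = set(true_labels)
    return sum(1 for j in ranked[:k] if j in true_set) / k


def convex_combination(base_scores, weights):
    """Mix per-label scores of several base learners.

    `weights` must lie on the probability simplex (non-negative, summing to 1).
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    num_labels = len(base_scores[0])
    return [
        sum(w * scores[j] for w, scores in zip(weights, base_scores))
        for j in range(num_labels)
    ]
```

For instance, with relevant labels `[0, 2]` and predicted labels `[0, 1]`, one of the two predictions is correct and one of the two relevant labels is recovered, so `f1_score` returns 0.5.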

Bibliographic details

  • Source
    Machine Learning | 2018, Issue 4 | pp. 703-725 | 23 pages
  • Author affiliations

    National Key Laboratory for Novel Software Technology, Nanjing University;

    National Key Laboratory for Novel Software Technology, Nanjing University;

    National Key Laboratory for Novel Software Technology, Nanjing University;

    National Key Laboratory for Novel Software Technology, Nanjing University;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Multi-label learning; Weakly labeled data; Safe; Evaluation metric;
