Journal: IEEE Transactions on Neural Networks and Learning Systems

Harnessing Side Information for Classification Under Label Noise


Abstract

Practical data sets often contain label noise caused by various human factors or measurement errors, which means that a fraction of the training examples may be mislabeled. Such noisy labels mislead classifier training and severely degrade classification performance. Existing approaches to this problem are usually developed through various surrogate loss functions under the framework of empirical risk minimization. However, they are only suitable for binary classification and also require strong prior knowledge. Therefore, this article treats the example features as side information and formulates the noisy label removal problem as a matrix recovery problem. We denote our proposed method as "label noise handling via side information" (LNSI). Specifically, the observed label matrix is decomposed as the sum of two parts: the first part reveals the true labels and can be obtained by applying a low-rank mapping to the side information, and the second part captures the incorrect labels and is modeled by a row-sparse matrix. The merits of such a formulation lie in three aspects: 1) the strong recovery ability of this strategy has been sufficiently demonstrated by intensive theoretical work on side information; 2) multi-class situations can be handled directly with the aid of the learned projection matrix; and 3) only very weak assumptions are required for model design, making LNSI applicable to a wide range of practical problems. Moreover, we theoretically derive the generalization bound of LNSI and show that the expected classification error of LNSI is upper bounded. Experimental results on a variety of data sets, including UCI benchmark data sets and practical data sets, confirm the superiority of LNSI over state-of-the-art approaches to label noise handling.
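The decomposition described in the abstract (observed labels = low-rank mapping of the features + row-sparse noise) can be illustrated with a small sketch. The abstract does not state the objective or the solver, so everything below is an assumption made for illustration: the objective 0.5·‖Y − XZ − E‖_F² + λ₁‖Z‖_* + λ₂‖E‖₂,₁, the alternating proximal updates, and the names `svd_shrink`, `row_shrink`, `recover_labels`, `lam_rank`, and `lam_sparse` are hypothetical and should not be read as the authors' LNSI algorithm.

```python
import numpy as np

def svd_shrink(M, tau):
    """Soft-threshold singular values: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def row_shrink(M, tau):
    """Shrink each row's Euclidean norm by tau: proximal operator of tau * L2,1 norm."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return M * scale

def recover_labels(X, Y, lam_rank=0.1, lam_sparse=0.5, n_iter=300):
    """Assumed objective (not taken from the paper):
    0.5*||Y - X Z - E||_F^2 + lam_rank*||Z||_* + lam_sparse*||E||_{2,1}
    X: (n, d) features used as side information, Y: (n, c) observed one-hot labels.
    """
    n, d = X.shape
    c = Y.shape[1]
    Z = np.zeros((d, c))   # low-rank mapping from features to labels
    E = np.zeros((n, c))   # row-sparse part absorbing suspected label errors
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        # Proximal gradient step on Z with E held fixed.
        grad = X.T @ (X @ Z + E - Y)
        Z = svd_shrink(Z - step * grad, step * lam_rank)
        # Exact minimization over E with Z held fixed.
        E = row_shrink(Y - X @ Z, lam_sparse)
    return Z, E

# Synthetic demo (all data below is made up for illustration).
rng = np.random.default_rng(0)
n, d, c = 60, 5, 3
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, c))
clean = np.argmax(X @ W_true, axis=1)
noisy = clean.copy()
flipped = rng.choice(n, size=6, replace=False)
noisy[flipped] = (noisy[flipped] + 1) % c      # corrupt a few labels
Y = np.eye(c)[noisy]                           # observed label matrix

Z, E = recover_labels(X, Y)
predicted = np.argmax(X @ Z, axis=1)
print("agreement with clean labels:", np.mean(predicted == clean))
print("rows with nonzero noise term:", np.flatnonzero(np.linalg.norm(E, axis=1) > 0))
```

In this sketch a new example would be classified by taking the arg-max of `x @ Z`, which is how a learned projection can serve multi-class problems directly; the thresholds `lam_rank` and `lam_sparse` are untuned placeholders.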

Bibliographic details

  • Author affiliations

    Nanjing Univ Sci & Technol Sch Comp Sci & Engn Nanjing 210094 Peoples R China|Xidian Univ State Key Lab Integrated Serv Networks Xian Peoples R China;

    Nanjing Univ Sci & Technol Sch Comp Sci & Engn Nanjing 210094 Peoples R China|Xidian Univ State Key Lab Integrated Serv Networks Xian Peoples R China;

    Nanjing Univ Sci & Technol Sch Comp Sci & Engn PCA Lab Nanjing 210094 Peoples R China|Nanjing Univ Sci & Technol Sch Comp Sci & Engn Key Lab Intelligent Percept & Syst High Dimens In Minist Educ Nanjing 210094 Peoples R China|Nanjing Univ Sci & Technol Sch Comp Sci & Engn Jiangsu Key Lab Image & Video Understanding Socia Nanjing 210094 Peoples R China;

    Univ Sydney Fac Engn UBTECH Sydney Artificial Intelligence Ctr Sch Comp Sci Darlington NSW 2008 Australia;

    Nanjing Univ Sci & Technol Sch Comp Sci & Engn PCA Lab Nanjing 210094 Peoples R China|Nanjing Univ Sci & Technol Sch Comp Sci & Engn Key Lab Intelligent Percept & Syst High Dimens In Minist Educ Nanjing 210094 Peoples R China|Nanjing Univ Sci & Technol Sch Comp Sci & Engn Jiangsu Key Lab Image & Video Understanding Socia Nanjing 210094 Peoples R China;

    Univ Sydney Fac Engn UBTECH Sydney Artificial Intelligence Ctr Sch Comp Sci Darlington NSW 2008 Australia;

  • Original format: PDF
  • Language: English
  • Keywords

    Noise measurement; Matrix decomposition; Training; Computer science; Task analysis; Learning systems; Risk management; Classification; generalization bound; label noise; matrix recovery; side information;

