Conference: 2012 Brazilian Conference on Neural Networks

Particle Competition and Cooperation to Prevent Error Propagation from Mislabeled Data in Semi-supervised Learning

Abstract

Semi-supervised learning is applied to classification problems where only a small portion of the data items is labeled. In these cases, the reliability of the labels is a crucial factor, because mislabeled items may propagate wrong labels to a large portion of the data set, or even to all of it. This paper addresses this problem by presenting a graph-based (network-based) semi-supervised learning method specifically designed to handle data sets with mislabeled samples. The method uses teams of walking particles, with competitive and cooperative behavior, to propagate labels in the network constructed from the input data set. The proposed model is nature-inspired and incorporates features that make it robust to a considerable number of mislabeled data items. Computer simulations show the performance of the method in the presence of different percentages of mislabeled data, in networks of different sizes and average node degrees. Importantly, these simulations reveal the existence of critical points in the size of the mislabeled subset: below them the network is free of wrong-label contamination, but above them the mislabeled samples start to propagate their labels to the rest of the network. Moreover, numerical comparisons have been made between the proposed method and other representative graph-based semi-supervised learning methods, using both artificial and real-world data sets. Interestingly, the proposed method performs increasingly better than the others as the percentage of mislabeled samples grows.
