Conference: Iberoamerican Congress on Pattern Recognition
Managing Imbalanced Data Sets in Multi-label Problems: A Case Study with the SMOTE Algorithm

Abstract

Multi-label learning has become an increasingly active area in the machine learning community, since a wide variety of real-world problems are naturally multi-labeled. However, it is not uncommon to find disparities in the number of samples of each class, which poses an additional challenge for the learning algorithm. SMOTE is an oversampling technique that has been successfully applied to balance single-label data sets, but it has not been used in multi-label frameworks so far. In this work, several strategies are proposed and compared for generating synthetic samples to balance data sets when training multi-label algorithms. Results show that a correct selection of seed samples for oversampling improves the classification performance of multi-label algorithms. Uniform generation oversampling provides an efficient methodology for a wide range of real-world problems.
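
The abstract does not spell out the individual seed-selection strategies, so the following is only a minimal sketch of the generic SMOTE interpolation step applied to a multi-label data set. The function name `smote_for_label`, the rule "every sample carrying the label is a seed", and the choice to copy the base seed's label set onto each synthetic sample are illustrative assumptions, not the method evaluated in the paper.

```python
import numpy as np

def smote_for_label(X, Y, label, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from seeds that carry `label`.

    X: (n_samples, n_features) float array.
    Y: (n_samples, n_labels) 0/1 indicator array.
    Assumption: every sample with Y[:, label] == 1 is a seed.
    """
    rng = np.random.default_rng(seed)
    seeds = np.flatnonzero(Y[:, label] == 1)
    if len(seeds) < 2:
        raise ValueError("need at least two seed samples for interpolation")
    k = min(k, len(seeds) - 1)
    S = X[seeds]

    # Pairwise Euclidean distances among seeds; exclude self-matches.
    d = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]  # row i: k nearest seeds of seed i

    # Pick a random base seed and one of its k neighbours for each new sample.
    base = rng.integers(0, len(seeds), size=n_new)
    nb = neighbors[base, rng.integers(0, k, size=n_new)]

    # SMOTE interpolation: a random point on the segment base -> neighbour.
    gap = rng.random((n_new, 1))
    X_new = S[base] + gap * (S[nb] - S[base])

    # Assumption: the synthetic sample inherits the base seed's label set.
    Y_new = Y[seeds][base].copy()
    return X_new, Y_new
```

Calling `smote_for_label(X, Y, label=j, n_new=m)` once per under-represented label `j` and appending the results to the training set is one simple way to rebalance. How many samples to generate per label, and which samples to use as seeds, are exactly the design choices the abstract says affect classification performance.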