Journal: Scientific Programming

A Decoupling and Bidirectional Resampling Method for Multilabel Classification of Imbalanced Data with Label Concurrence

Abstract

Label imbalance is a common characteristic of multilabel data, and it seriously degrades classifier performance. In multilabel classification, resampling is the most widely used way to handle imbalance. Existing resampling methods balance the data by either undersampling or oversampling alone, which leads to information loss or overfitting, respectively, and resampling has a significant impact on the minority labels. Furthermore, the high concurrence of majority labels and minority labels within the same instances also hurts classification performance. In this study, we propose a bidirectional resampling method that decouples multilabel datasets. On the one hand, label concurrence is reduced by setting termination conditions for the decoupling; on the other hand, the loss of instance information and overfitting are alleviated by combining oversampling and undersampling. Each instance is measured by the minority labels it carries, and the instances that have less impact on minority labels are selected for resampling. The number of resampled instances is limited so that the original distribution of the data is preserved during the resampling phase. Experiments on seven benchmark multilabel datasets demonstrate the effectiveness of the algorithm, especially on datasets with high concurrence of majority labels and minority labels.
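
To make the terms "minority label" and "label concurrence" concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It rests on two assumptions: a label's imbalance ratio is the count of the most frequent label divided by that label's own count, and a label counts as "minority" when its ratio exceeds the mean ratio. The function names and the toy matrix `Y` are hypothetical, and the sketch assumes every label occurs at least once.

```python
from typing import List, Set, Tuple

def imbalance_ratios(Y: List[List[int]]) -> List[float]:
    """Per-label ratio: count of the most frequent label / this label's count."""
    n_labels = len(Y[0])
    counts = [sum(row[j] for row in Y) for j in range(n_labels)]
    top = max(counts)
    return [top / c if c > 0 else float("inf") for c in counts]

def split_labels(Y: List[List[int]]) -> Tuple[Set[int], Set[int]]:
    """Assumed rule: a label is 'minority' if its ratio exceeds the mean ratio."""
    ratios = imbalance_ratios(Y)
    mean_ir = sum(ratios) / len(ratios)
    minority = {j for j, r in enumerate(ratios) if r > mean_ir}
    majority = set(range(len(ratios))) - minority
    return minority, majority

def concurrent_instances(Y: List[List[int]]) -> List[int]:
    """Indices of instances that carry both a minority and a majority label."""
    minority, majority = split_labels(Y)
    return [
        i for i, row in enumerate(Y)
        if any(row[j] for j in minority) and any(row[j] for j in majority)
    ]

# Toy binary label matrix: rows are instances, columns are labels.
Y = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
]

print(imbalance_ratios(Y))
print(split_labels(Y))
print(concurrent_instances(Y))
```

Instances returned by `concurrent_instances` are the ones where resampling one label group inevitably changes the other. That coupling is what a decoupling step would aim to reduce before any oversampling or undersampling takes place.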