Journal: IEEE Transactions on Neural Networks and Learning Systems

Radial-Based Oversampling for Multiclass Imbalanced Data Classification


Abstract

Learning from imbalanced data is among the most popular topics in contemporary machine learning. However, the vast majority of attention in this field is given to binary problems, while their much more difficult multiclass counterparts remain relatively unexplored. Handling data sets with multiple skewed classes poses various challenges and calls for a better understanding of the relationships among classes. In this paper, we propose multiclass radial-based oversampling (MC-RBO), a novel data-sampling algorithm dedicated to multiclass problems. The main novelty of our method lies in using potential functions to generate artificial instances. We take into account information coming from all of the classes, in contrast to existing multiclass oversampling approaches, which use only minority class characteristics. The generation of artificial instances is guided by exploring areas where the value of the mutual class distribution is very small. In this way, we ensure a smart oversampling procedure that can cope with difficult data distributions and alleviate the shortcomings of existing methods. The usefulness of the MC-RBO algorithm is evaluated in an extensive experimental study and backed up by a thorough statistical analysis. The results show that by taking into account information coming from all of the classes and conducting smart oversampling, we can significantly improve the process of learning from multiclass imbalanced data.
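
The idea of guiding oversampling with a potential function built from all classes can be illustrated with a minimal sketch. This is not the MC-RBO algorithm itself, whose details the abstract does not give. It assumes a Gaussian radial basis function with Euclidean distance, a mutual potential defined as the other classes' potential minus the target class's potential, and a simple random-walk search that accepts moves toward smaller absolute potential. All function names and parameters (`rbf_potential`, `mutual_potential`, `oversample_class`, `gamma`, `step`, `n_steps`) are hypothetical.

```python
import numpy as np


def rbf_potential(points, samples, gamma=1.0):
    """Sum of Gaussian RBFs centred on `samples`, evaluated at each row of `points`.

    Assumption: Gaussian kernel with Euclidean distance and spread `gamma`.
    """
    # Pairwise distances, shape (n_points, n_samples).
    dists = np.linalg.norm(points[:, None, :] - samples[None, :, :], axis=2)
    return np.exp(-((dists / gamma) ** 2)).sum(axis=1)


def mutual_potential(points, X, y, target, gamma=1.0):
    """Potential of all other classes minus potential of the target class.

    Assumption: this signed difference stands in for the "mutual class
    distribution" mentioned in the abstract.
    """
    own = rbf_potential(points, X[y == target], gamma)
    other = rbf_potential(points, X[y != target], gamma)
    return other - own


def oversample_class(X, y, target, n_new, gamma=1.0, step=0.1, n_steps=20, rng=None):
    """Generate `n_new` synthetic points for class `target`.

    Each synthetic point starts at a randomly chosen instance of the target
    class and takes random steps, keeping a step only when it lowers the
    absolute mutual potential (an illustrative acceptance rule).
    """
    rng = np.random.default_rng(rng)
    minority = X[y == target]
    seeds = minority[rng.integers(len(minority), size=n_new)].copy()
    for _ in range(n_steps):
        candidates = seeds + rng.normal(scale=step, size=seeds.shape)
        current = np.abs(mutual_potential(seeds, X, y, target, gamma))
        proposed = np.abs(mutual_potential(candidates, X, y, target, gamma))
        better = proposed < current
        seeds[better] = candidates[better]
    return seeds
```

Calling `oversample_class(X, y, target=c, n_new=k)` once per under-represented class `c` would give a simple multiclass loop. Because the potential uses every class, the synthetic points depend on the neighbouring classes as well as on the minority class.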
