IEEE Transactions on Neural Networks and Learning Systems

Oversampling the Minority Class in the Feature Space



Abstract

The imbalanced nature of some real-world data is one of the current challenges for machine learning researchers. One common approach oversamples the minority class through convex combinations of its patterns. We explore the general idea of synthetic oversampling in the feature space induced by a kernel function (as opposed to input space). If the kernel function matches the underlying problem, the classes will be linearly separable and synthetically generated patterns will lie on the minority class region. Since the feature space is not directly accessible, we use the empirical feature space (EFS), a Euclidean space isomorphic to the feature space, for oversampling purposes. The proposed method is framed in the context of support vector machines, where imbalanced data sets can pose a serious hindrance. The idea is investigated in three scenarios: 1) oversampling in the full and reduced-rank EFSs; 2) a kernel learning technique maximizing the data class separation to study the influence of the feature space structure (implicitly defined by the kernel function); and 3) a unified framework for preferential oversampling that spans some of the previous approaches in the literature. We support our investigation with extensive experiments over 50 imbalanced data sets.
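The following is a minimal NumPy sketch of the core construction the abstract describes: mapping the training data into the empirical feature space via the eigendecomposition of the kernel matrix, then generating synthetic minority patterns as convex combinations there. The RBF kernel, the `gamma` value, and the SMOTE-like nearest-neighbour pairing are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: synthetic oversampling of the minority class in the empirical feature space (EFS).
# Assumptions for illustration: RBF kernel, full- or reduced-rank EFS via eigendecomposition
# of the kernel matrix K = V Lambda V^T, and SMOTE-like convex combinations of neighbours.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gram matrix of k(a, b) = exp(-gamma * ||a - b||^2)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

def empirical_feature_map(X_train, gamma=1.0, rank=None, eps=1e-10):
    """EFS map phi_e(x) = Lambda^{-1/2} V^T k(x); rank=None keeps the full-rank EFS."""
    K = rbf_kernel(X_train, X_train, gamma)
    eigvals, eigvecs = np.linalg.eigh(K)
    order = np.argsort(eigvals)[::-1]                  # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > eps
    if rank is not None:
        keep &= np.arange(len(eigvals)) < rank         # reduced-rank EFS
    L, V = eigvals[keep], eigvecs[:, keep]
    proj = V / np.sqrt(L)                              # columns scaled by Lambda^{-1/2}
    return lambda X: rbf_kernel(X, X_train, gamma) @ proj

def oversample_minority_efs(X, y, minority_label, n_new, k=5, gamma=1.0, seed=None):
    """Generate n_new synthetic minority patterns as convex combinations in the EFS."""
    rng = np.random.default_rng(seed)
    phi = empirical_feature_map(X, gamma=gamma)
    Z_min = phi(X[y == minority_label])                # minority class mapped into the EFS
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(Z_min))
        d = np.linalg.norm(Z_min - Z_min[i], axis=1)
        j = rng.choice(np.argsort(d)[1:k + 1])         # one of the k nearest minority neighbours
        lam = rng.random()                             # convex combination weight in [0, 1]
        synthetic.append(Z_min[i] + lam * (Z_min[j] - Z_min[i]))
    return np.asarray(synthetic)
```

On the training set, inner products in this EFS reproduce the kernel values, so convex combinations taken there correspond to synthetic points in the kernel-induced feature space rather than in input space, which is the motivation the abstract gives for oversampling in the EFS.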

