Journal: Connection Science

Learning in presence of class imbalance and class overlapping by using one-class SVM and undersampling technique

Abstract

The class imbalance problem undermines traditional learning models by degrading performance and yielding erroneous outcomes. It arises when one class is overshadowed by the other classes in a data space. Class imbalance can cause serious difficulty because the misclassification cost of the minority class tends to be very high. When overlapping cases occur together with imbalanced data, effective learning becomes harder still. This study presents an in-depth analysis of the effects of class imbalance and class overlapping on conventional learning models. A data-level approach is adopted in which one-class SVM-based anomaly detection identifies the overlapping cases, while an adapted Tomek-link undersampling algorithm is defined to treat both overlapped and imbalanced cases. The proposed model eliminates borderline, redundant and overlapping cases by taking Tomek-link pairs and sparse neighbourhoods into account. The proposed method has been evaluated against six state-of-the-art models on seven binary and two multiclass datasets, with respect to three standard learning models. It has also been evaluated with cost-sensitive learning and extreme learning based approaches for imbalanced class learning. Experimental analyses establish the proficiency of the proposed method over state-of-the-art models.
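For readers unfamiliar with Tomek links, the following is a minimal sketch of the classic Tomek-link undersampling idea only. It is not the adapted algorithm proposed in the paper, and it leaves out the one-class SVM overlap-detection step and the sparse-neighbourhood criterion entirely. A Tomek link is taken here to be a pair of points with different labels that are each other's nearest neighbour. The function names, the toy data and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i] (Euclidean), excluding i itself."""
    best_j, best_d = None, math.inf
    for j, p in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], p)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def tomek_links(points, labels):
    """Pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


def undersample_majority(points, labels, majority_label):
    """Drop the majority-class member of every Tomek link (classic variant)."""
    drop = set()
    for i, j in tomek_links(points, labels):
        drop.add(i if labels[i] == majority_label else j)
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Hypothetical toy data: class 0 is the majority, class 1 the minority.
# The majority point (2.0, 0.0) sits right next to the minority point (2.4, 0.0).
points = [(-2.2, 0.0), (-1.0, 0.0), (0.0, 0.0), (1.1, 0.0),
          (2.0, 0.0), (2.4, 0.0), (5.0, 0.0), (5.5, 0.0)]
labels = [0, 0, 0, 0, 0, 1, 1, 1]

links = tomek_links(points, labels)
kept_points, kept_labels = undersample_majority(points, labels, majority_label=0)
```

On this toy data the only cross-class mutual-nearest-neighbour pair is the borderline one, so the sketch removes a single majority point and leaves the minority class untouched. The brute-force neighbour search is quadratic in the number of points, which is acceptable for illustration but not for large datasets.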
