Conference: International Conference on Computer Communication and Informatics

A Boosting based Adaptive Oversampling Technique for Treatment of Class Imbalance


Abstract

Class imbalance and its consequences have long been an active research topic because of their impact on real-life applications such as medical disease diagnosis and fraud detection. Typical solutions operate at the data level (undersampling or oversampling) or at the algorithmic level (cost-sensitive learning). The Synthetic Minority Oversampling Technique (SMOTE) is acknowledged as one of the most effective data-level solutions, but it often suffers from overfitting because it applies a uniform oversampling rate. Ensemble learning techniques have recently emerged as effective, but they yield their best results when integrated with data-level solutions. This work introduces a Boosting-based oversampling technique with a customized oversampling rate, set within an ensemble framework through a cost-sensitive error formulation. The oversampling rate is tailored using the Local Covariance Matrix (LCM), and an AdaBoost ensemble with C4.5 weak learners serves as the ensemble framework. The proposed technique is compared with six benchmark techniques on seven binary datasets. The experimental results demonstrate the efficiency of the proposed technique in treating imbalanced data.
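To make the contrast between a uniform and a customized oversampling rate concrete, the following is a minimal sketch of SMOTE-style interpolation in which each minority sample receives its own number of synthetic neighbours. It is not the paper's method: the abstract does not say how the Local Covariance Matrix is turned into a rate, so the `rates` argument, the function name `smote_oversample`, and the toy data are all assumptions made for illustration. Plain SMOTE corresponds to passing the same rate for every sample.

```python
import math
import random


def smote_oversample(minority, rates, k=3, seed=0):
    """Create synthetic minority samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    rates:    number of synthetic samples to create for each minority sample.
              This per-sample rate is a hypothetical input; the paper derives
              it from the Local Covariance Matrix, which is not modelled here.
    k:        number of nearest minority neighbours to interpolate towards.
    """
    rng = random.Random(seed)
    synthetic = []
    for i, (x, n_new) in enumerate(zip(minority, rates)):
        # Rank the other minority samples by Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        if not neighbours:
            continue  # a single minority sample has nothing to interpolate with
        for _ in range(n_new):
            nb = rng.choice(neighbours)
            gap = rng.random()  # interpolation factor in [0, 1)
            # The new point lies on the segment between x and the chosen neighbour.
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy data (assumed): the third sample is isolated, so it is given a higher rate.
minority = [[1.0, 1.0], [1.2, 0.9], [3.0, 3.1], [0.9, 1.1]]
rates = [1, 1, 4, 1]
new_points = smote_oversample(minority, rates, k=2)
print(len(new_points))  # 7, the sum of the per-sample rates
```

In a boosting setup such as the one the abstract describes, a step like this would typically run on the training data before or between rounds of the weak learner. How the paper actually interleaves oversampling with AdaBoost and the cost-sensitive error is not stated in the abstract.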
