Conference: International Conference on Advanced Data Mining and Applications

A Normal Distribution-Based Over-Sampling Approach to Imbalanced Data Classification


Abstract

This study proposes a normal distribution-based over-sampling approach for balancing the number of instances across the classes of a data set. The balanced training data are then used to learn unbiased classifiers for the original data set. Under some conditions, the proposed approach generates samples whose expected mean and variance are similar to those of the original minority class data. Because the approach tries to generate synthetic data with a probability distribution similar to that of the original data while expanding the boundaries of the minority class, it may improve classification performance on the minority class. Experimental results show that, with several classical classification algorithms, the proposed approach outperforms alternative methods on benchmark data sets most of the time.
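The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of the general idea rather than the paper's algorithm. It assumes that each minority-class feature is modeled as an independent normal distribution whose mean and standard deviation are estimated from the minority rows. The names `normal_oversample` and `balance` and the toy data are hypothetical.

```python
import random
import statistics


def normal_oversample(minority, n_new, seed=0):
    """Draw n_new synthetic rows from per-feature normal fits to `minority`.

    Assumption (not taken from the paper): features are independent, and
    each one is modeled as N(mean, std) estimated from the minority rows.
    """
    rng = random.Random(seed)
    columns = list(zip(*minority))  # transpose rows into feature columns
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # needs >= 2 rows
    return [
        [rng.gauss(mu, sigma) for mu, sigma in zip(means, stds)]
        for _ in range(n_new)
    ]


def balance(majority, minority, seed=0):
    """Return the minority rows plus synthetic rows up to the majority count."""
    n_new = max(0, len(majority) - len(minority))
    return list(minority) + normal_oversample(minority, n_new, seed)


# Toy data (hypothetical): 200 majority rows, 5 minority rows, 2 features.
majority = [[float(i), float(i % 7)] for i in range(200)]
minority = [[1.0, 2.0], [1.5, 2.5], [0.5, 1.5], [1.2, 2.2], [0.8, 1.8]]
balanced_minority = balance(majority, minority)
print(len(majority), len(balanced_minority))  # prints: 200 200
```

Because every synthetic value is drawn from a normal distribution fitted to the minority class, the synthetic rows have, in expectation, the same per-feature mean and variance as the original minority rows, and some of them fall outside the range of the observed minority points. Treating features as independent ignores correlations between them, which a fuller implementation would likely need to handle.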
