A Novel SMOTE-Based Classification Approach to Online Data Imbalance Problem


Abstract

In many practical engineering applications, data are usually collected online. When the classes of these data are severely imbalanced, classification performance suffers. This paper proposes a novel classification approach to the online data imbalance problem that integrates a fast and efficient learning algorithm, the Extreme Learning Machine (ELM), with a typical sampling strategy, the synthetic minority oversampling technique (SMOTE). To reduce the severe imbalance, the major-class samples are divided into granules according to their distribution characteristics, and the original samples are replaced by the resulting granule cores to prepare a balanced sample set. In the online stage, the minor-class samples are first divided into granules, and SMOTE oversampling is then performed in the regions around the granule cores and granule borders. The training sample set is thus gradually balanced while the online ELM model is dynamically updated. Fuzzy information entropy is also introduced to prove theoretically that the proposed approach has a lower bound on model reliability after undersampling. Numerical experiments on two different kinds of datasets demonstrate that the proposed approach outperforms some state-of-the-art methods in terms of generalization performance and numerical stability.
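
For readers unfamiliar with SMOTE, the interpolation step the abstract relies on can be sketched as follows. This is a minimal illustration of classic SMOTE only, not the authors' granulation-based variant or their online ELM update; the function name `smote` and its parameters (`n_synthetic`, `k`, `seed`) are hypothetical choices made for this sketch.

```python
import numpy as np

def smote(minority, n_synthetic, k=5, seed=0):
    """Classic SMOTE sketch: interpolate between minority samples and
    their k nearest minority-class neighbours (names are illustrative)."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # a sample cannot have more neighbours than n - 1
    if k < 1:
        raise ValueError("need at least two minority samples")

    # Pairwise squared Euclidean distances between minority samples.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # exclude each sample from its own neighbours

    # Indices of the k nearest neighbours of every minority sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # For each synthetic point: pick a base sample, one of its neighbours,
    # and a random position on the segment between them.
    base = rng.integers(0, n, size=n_synthetic)
    picked = neighbours[base, rng.integers(0, k, size=n_synthetic)]
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[picked] - minority[base])
```

Each synthetic point lies on the line segment between a minority sample and one of its neighbours, so the new samples stay inside the region already occupied by the minority class. The approach in the abstract restricts this step to the regions around granule cores and granule borders, which the sketch does not attempt to reproduce.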

Bibliographic record

  • Source
    Mathematical Problems in Engineering | 2016, Issue 5 | 5685970.1-5685970.14 | 14 pages
  • Authors

    Gong Chunlin; Gu Liangxian;

  • Affiliations

    Northwestern Polytech Univ, Youyi West Rd 127, Xian 710072, Peoples R China|Natl Aerosp Flight Dynam Key Lab, Youyi West Rd 127, Xian 710072, Peoples R China;

    Northwestern Polytech Univ, Youyi West Rd 127, Xian 710072, Peoples R China|Natl Aerosp Flight Dynam Key Lab, Youyi West Rd 127, Xian 710072, Peoples R China;

  • Original format: PDF
  • Language: English
