Journal: Knowledge and Information Systems

SMOTE-RSB *: a hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory


Abstract

Imbalanced data is a common problem in classification. This phenomenon is growing in importance since it appears in most real domains. It has special relevance to highly imbalanced data-sets (when the ratio between classes is high). Many techniques have been developed to tackle the problem of imbalanced training sets in supervised learning. Such techniques have been divided into two large groups: those at the algorithm level and those at the data level. Data level groups that have been emphasized are those that try to balance the training sets by reducing the larger class through the elimination of samples or increasing the smaller one by constructing new samples, known as undersampling and oversampling, respectively. This paper proposes a new hybrid method for preprocessing imbalanced data-sets through the construction of new samples, using the Synthetic Minority Oversampling Technique together with the application of an editing technique based on the Rough Set Theory and the lower approximation of a subset. The proposed method has been validated by an experimental study showing good results using C4.5 as the learning algorithm.
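The two stages named in the abstract, SMOTE oversampling followed by editing with a rough-set lower approximation, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity relation, so the sketch assumes that two samples are "similar" when their Euclidean distance is at most a hypothetical `radius` parameter, and it keeps a synthetic sample only when no majority-class sample is similar to it (that is, when the sample lies in the lower approximation of the minority class under that assumed relation). All function names are illustrative.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote(minority, n_new, k=3, rng=None):
    """Generate n_new synthetic minority samples by interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: euclidean(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def in_lower_approximation(sample, majority, radius):
    """Assumed similarity relation: distance <= radius.

    The sample belongs to the lower approximation of the minority class
    when none of the samples similar to it is a majority sample.
    """
    return all(euclidean(sample, m) > radius for m in majority)


def smote_with_rough_set_editing(minority, majority, n_new, radius, k=3, rng=None):
    """Oversample with SMOTE, then discard synthetic samples that fall
    outside the lower approximation of the minority class."""
    candidates = smote(minority, n_new, k=k, rng=rng)
    return [s for s in candidates if in_lower_approximation(s, majority, radius)]


if __name__ == "__main__":
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    majority = [(0.5, 0.5), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
    kept = smote_with_rough_set_editing(minority, majority, n_new=20, radius=0.3)
    print(f"kept {len(kept)} of 20 synthetic samples")
```

In this toy data, one majority point sits in the middle of the minority cluster, so synthetic samples interpolated close to it are filtered out while those along the edges of the cluster survive. The edited synthetic samples would then be added to the original training set before fitting a classifier such as C4.5.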
