Computational Intelligence and Neuroscience

Embedding Undersampling Rotation Forest for Imbalanced Problem



Abstract

Rotation Forest is an ensemble learning approach that achieves better performance than Bagging and Boosting by building accurate and diverse classifiers in rotated feature spaces. However, like other conventional classifiers, Rotation Forest does not work well on imbalanced data, in which one class (the minority class) has far fewer examples than the other (the majority class) and misclassifying a minority class example is often much more costly than misclassifying a majority class example. This paper proposes a novel method called Embedding Undersampling Rotation Forest (EURF) to handle this problem in two steps: (1) sampling subsets from the majority class and learning a projection matrix from each subset, and (2) obtaining training sets by projecting re-undersampled subsets of the original data set into the new spaces defined by the matrices, then constructing an individual classifier from each training set. In the first step, undersampling forces the rotation matrix to better capture the features of the minority class without harming the diversity among individual classifiers. In the second step, undersampling aims to improve the performance of individual classifiers on the minority class. The experimental results show that EURF achieves significantly better performance than other state-of-the-art methods.
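The two steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; several details are assumptions made only for illustration: labels are 0 (majority) and 1 (minority), each undersampled subset keeps all minority examples plus an equally sized random draw from the majority class, the projection matrix is built Rotation Forest style from per-group PCA over random feature groups, and a one-level decision stump stands in for the decision trees normally used as base classifiers. All function names (`undersample`, `rotation_matrix`, `fit_eurf`, and so on) are hypothetical.

```python
import numpy as np

def undersample(X, y, rng, minority=1):
    """Keep every minority row plus an equally sized random draw of majority rows."""
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y != minority)
    keep = rng.choice(maj_idx, size=min(len(min_idx), len(maj_idx)), replace=False)
    idx = np.concatenate([min_idx, keep])
    return X[idx], y[idx]

def rotation_matrix(X, n_groups, rng):
    """Block rotation: PCA on each random, disjoint feature group (assumed scheme)."""
    d = X.shape[1]
    R = np.zeros((d, d))
    for cols in np.array_split(rng.permutation(d), min(n_groups, d)):
        Xc = X[:, cols] - X[:, cols].mean(axis=0)
        # Rows of vt are the principal axes of this feature group.
        _, _, vt = np.linalg.svd(Xc, full_matrices=True)
        R[np.ix_(cols, cols)] = vt.T
    return R

def fit_stump(X, y):
    """Stand-in base learner: best single-feature threshold by training error."""
    best = (np.inf, 0, 0.0, 1)  # (errors, feature, threshold, label above threshold)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            above = (X[:, j] > t).astype(int)
            for hi in (1, 0):
                err = np.sum((above if hi == 1 else 1 - above) != y)
                if err < best[0]:
                    best = (err, j, t, hi)
    return best[1:]

def predict_stump(stump, X):
    j, t, hi = stump
    above = (X[:, j] > t).astype(int)
    return above if hi == 1 else 1 - above

def fit_eurf(X, y, n_estimators=10, n_groups=2, seed=0):
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_estimators):
        Xs, _ = undersample(X, y, rng)           # step 1: subset for the projection
        R = rotation_matrix(Xs, n_groups, rng)
        Xt, yt = undersample(X, y, rng)          # step 2: re-undersample for training
        members.append((R, fit_stump(Xt @ R, yt)))
    return members

def predict_eurf(members, X):
    """Majority vote over the ensemble members."""
    votes = np.mean([predict_stump(s, X @ R) for R, s in members], axis=0)
    return (votes >= 0.5).astype(int)

# Synthetic imbalanced data: 200 majority rows, 20 minority rows, 4 features.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0, 1, (200, 4)), data_rng.normal(3, 1, (20, 4))])
y = np.array([0] * 200 + [1] * 20)
members = fit_eurf(X, y)
pred = predict_eurf(members, X)
```

Because a fresh subset is drawn for each projection matrix and again for each training set, every ensemble member sees a different balanced view of the data, which is the source of diversity the abstract refers to. The stump is used only to keep the sketch dependency-free; any classifier that is sensitive to feature rotation could take its place.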
