《计算机工程与应用》 (Computer Engineering and Applications)

A New Boosting Algorithm for Imbalanced Datasets

Abstract

This paper proposes ILAdaboost, a new AdaBoost training method for imbalanced datasets. In each round, the algorithm evaluates the original dataset with the base classifier learned in that round and, according to the evaluation results, divides the original dataset into four subsets. It then resamples from the four subsets to form a balanced dataset on which the next round's base classifier is trained. Because the resampling favors the minority class and the misclassified majority-class samples, the decision boundary of the combined classifier shifts away from the minority class. In experiments on 10 typical imbalanced datasets from UCI, the algorithm improves minority-class accuracy and GMA while maintaining majority-class accuracy.
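
The partition-and-resample step of a single round can be sketched roughly as follows. This is a minimal illustration and not the paper's procedure: the abstract does not say how the four subsets are defined or how many samples are drawn from each. The sketch assumes the subsets are class (minority or majority) crossed with outcome (correct or misclassified). The names `split_four`, `resample_balanced`, `minority_label`, and `maj_err_share` are hypothetical.

```python
import random

def split_four(y_true, y_pred, minority_label=1):
    """Partition sample indices into four groups by class and outcome.

    Assumption: the four subsets are minority/majority crossed with
    correctly classified/misclassified. The abstract does not define them.
    """
    groups = {"min_ok": [], "min_err": [], "maj_ok": [], "maj_err": []}
    for i, (t, p) in enumerate(zip(y_true, y_pred)):
        cls = "min" if t == minority_label else "maj"
        status = "ok" if t == p else "err"
        groups[f"{cls}_{status}"].append(i)
    return groups

def resample_balanced(groups, rng, maj_err_share=0.5):
    """Draw a class-balanced set of indices for the next round.

    Assumed scheme (hypothetical): keep every minority sample, then fill
    an equally sized majority quota, reserving a fixed share of it for
    misclassified majority samples.
    """
    minority = groups["min_ok"] + groups["min_err"]
    quota = len(minority)
    # Take as many misclassified majority samples as the share allows.
    n_err = min(len(groups["maj_err"]), round(quota * maj_err_share))
    # Fill the remaining majority quota from correctly classified samples.
    n_ok = min(len(groups["maj_ok"]), quota - n_err)
    majority = rng.sample(groups["maj_err"], n_err) + rng.sample(groups["maj_ok"], n_ok)
    return sorted(minority + majority)

# Toy example: 2 minority samples (label 1) and 6 majority samples (label 0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
groups = split_four(y_true, y_pred)
next_round_indices = resample_balanced(groups, random.Random(0))
```

In a full boosting loop, the next base classifier would be fitted on the samples at `next_round_indices`, and the split would be recomputed from its predictions on the original dataset.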
