Published in: 《交通科学与工程》

A Congestion Identification Method Based on Cross-Combination Resampling


Abstract

To address the imbalanced distribution of congestion data, a new resampling method, cross-combination resampling, is proposed. The method combines random undersampling with SMOTE and applies them to the original data in a crossed fashion, in order to reduce the non-uniform distortion that resampling imposes on the original data. Simulation yields an original sample of non-congested and congested data in a ratio of 1:10.1. In line with practical conditions, cross sampling is then used to produce datasets with class ratios of 1:5, 1:3, and 1:1, and the classification results for the three cases are compared. A naive Bayes classifier, a Bayesian network classifier, and a neural network classifier are used to compare cross-combination resampling with ordinary combination resampling on the datasets of different ratios. The experiments show that cross-combination resampling better resolves the problems that imbalanced congestion data cause for classifiers.
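The two building blocks named in the abstract, random undersampling of the larger class and SMOTE oversampling of the smaller class, can be sketched as follows. This is only an illustrative sketch, not the authors' procedure: the abstract does not describe how the "cross" step interleaves the two samplers, so the code simply applies each once. The function names, the parameter `ratio` (majority samples per minority sample), and the geometric-mean rule for choosing the intermediate class sizes are all assumptions made for the example.

```python
import math
import random


def random_undersample(majority, n_keep, rng):
    """Keep n_keep majority samples, drawn uniformly without replacement."""
    return rng.sample(majority, n_keep)


def smote(minority, n_new, k, rng):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` inside the minority class, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def combined_resample(majority, minority, ratio, k=5, seed=0):
    """Return (majority_kept, minority_grown) with about `ratio` majority
    samples per minority sample.

    Assumption (not from the paper): the adjustment is split between the two
    samplers by taking the geometric mean of the original majority size and
    the size it would need if only undersampling were used.
    """
    rng = random.Random(seed)
    n_maj = round(math.sqrt(len(majority) * len(minority) * ratio))
    n_maj = min(n_maj, len(majority))
    n_min = max(len(minority), round(n_maj / ratio))
    kept = random_undersample(majority, n_maj, rng)
    grown = minority + smote(minority, n_min - len(minority), k, rng)
    return kept, grown
```

With 1010 majority and 100 minority samples (roughly the 1:10.1 ratio reported above), calling `combined_resample(majority, minority, ratio)` with `ratio` set to 5, 3, or 1 produces datasets close to the 1:5, 1:3, and 1:1 ratios used in the experiments. Each sample is expected to be a tuple of numeric features.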
