Conference: IEEE International Conference on Machine Learning and Applications

ABC-sampling for Balancing Imbalanced Datasets Based on Artificial Bee Colony Algorithm


Abstract

Class imbalance is a common problem for predictive modelling in domains such as bioinformatics. It occurs when classes are not uniformly distributed among samples, and it biases learned predictions towards the majority classes. In this study, we propose the ABC-Sampling algorithm, based on a swarm optimization method called Artificial Bee Colony, which models the natural foraging behaviour of honeybees. Our algorithm lessens the effects of class imbalance by selecting the most informative majority samples using a forward search and storing them in a ranked subset. We then construct a balanced dataset with a planned undersampling strategy that extracts the most frequent majority samples from the top-ranked subset and combines them with all minority samples. Our algorithm outperforms a state-of-the-art method on nine benchmark datasets with various imbalance ratios.
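
The final undersampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Artificial Bee Colony search is not shown, and the function name `balance_by_frequency`, the `top_n` parameter, the tie-breaking rule, and the choice to keep exactly as many majority samples as there are minority samples are all assumptions made for illustration. The sketch takes candidate subsets of majority-sample indices that are assumed to be already ranked best first, counts how often each index appears among the top-ranked ones, and keeps the most frequent indices together with every minority sample.

```python
from collections import Counter


def balance_by_frequency(ranked_subsets, majority, minority, top_n=3):
    """Hypothetical helper: frequency-based undersampling of the majority class.

    ranked_subsets: lists of majority-sample indices, assumed sorted best first
                    (for example, by a search such as Artificial Bee Colony).
    top_n:          how many top-ranked subsets to consult (an assumption).
    """
    # Count how often each majority index occurs in the top-ranked subsets.
    counts = Counter(i for subset in ranked_subsets[:top_n] for i in subset)
    # Sort by descending frequency, then by index so ties resolve deterministically.
    by_frequency = sorted(counts, key=lambda i: (-counts[i], i))
    # Assumption: "balanced" means one majority sample per minority sample.
    chosen = by_frequency[:len(minority)]
    # Combine the chosen majority samples with all minority samples.
    return [majority[i] for i in chosen] + list(minority)


majority = ["a", "b", "c", "d", "e", "f"]
minority = ["x", "y"]
ranked_subsets = [[0, 2, 4], [2, 4, 5], [1, 2, 3], [0, 1]]
balanced = balance_by_frequency(ranked_subsets, majority, minority)
print(balanced)  # ['c', 'e', 'x', 'y']
```

In this toy input, index 2 appears in all three consulted subsets and index 4 in two of them, so samples `c` and `e` are kept alongside both minority samples. If the consulted subsets contain fewer distinct indices than there are minority samples, the result will hold fewer majority samples than minority ones.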
