Journal: International Journal of Machine Learning and Cybernetics

Research on classification method of high-dimensional class-imbalanced datasets based on SVM



Abstract

High-dimensional data degrade classification performance because some combinations of features have an adverse effect on classification, while class imbalance biases the classifier toward the majority class at the expense of the minority class, because majority-class samples outnumber minority-class samples. Problems that are both high-dimensional and class-imbalanced arise in many fields, such as bioinformatics and healthcare. Many researchers have studied either the high-dimensional problem or the class-imbalanced problem and have proposed a series of algorithms, but they overlook the new problem that arises when the two occur together: high dimensionality affects the sampling process, while class imbalance interferes with feature selection. This paper first analyses the new problem arising from the mutual influence of the two problems, then introduces SVM and analyses its advantages in dealing with high-dimensional and class-imbalanced problems. Next, it proposes a new algorithm named BRFE-PBKS-SVM aimed at high-dimensional class-imbalanced datasets. The algorithm improves SVM-RFE by taking class imbalance into account during feature selection, and improves SMOTE so that over-sampling can work in the Hilbert space with an adaptive over-sampling rate determined by PSO. Finally, experimental results show the performance of the algorithm.
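
For readers unfamiliar with SMOTE, the over-sampling step that the abstract builds on can be sketched as follows. This is a minimal illustration of plain SMOTE in the original input space with a fixed over-sampling amount. It is not the BRFE-PBKS-SVM algorithm: the paper's variant works in the Hilbert space and tunes the over-sampling rate with PSO, and neither is reproduced here. The function name `smote`, its parameters, and the toy data are assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (plain input-space SMOTE).

    Each new point lies on the segment between a randomly chosen minority
    sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 4 minority samples against 10 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
n_majority = 10
new_points = smote(minority, n_new=n_majority - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority))
```

Here the number of synthetic points is simply the gap between the two class sizes. According to the abstract, the paper instead determines the over-sampling rate adaptively with PSO.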
