
A dissimilarity-based imbalance data classification algorithm



Abstract

Class imbalance has been reported to compromise the performance of most standard classifiers, such as Naive Bayes, Decision Trees and Neural Networks. To address this problem, various solutions have been explored, mainly by balancing the skewed class distribution or by improving existing classification algorithms. However, these methods focus on the imbalanced distribution and ignore the discriminative ability of features in class-imbalanced data. From this perspective, a dissimilarity-based method is proposed for classifying imbalanced data. The proposed method first removes useless and redundant features from the given data set by feature selection; then extracts representative instances from the reduced data as prototypes; and finally projects the reduced data into a dissimilarity space by constructing new features, and builds the classification model on the data in that space. Extensive experiments over 24 benchmark class-imbalanced data sets show that, compared with seven other solutions for handling imbalanced data, the proposed method greatly improves the performance of imbalance learning and outperforms the other solutions with all given classification algorithms.
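The three stages named in the abstract can be illustrated with a minimal sketch. The abstract does not say which feature selector, prototype extractor, dissimilarity measure or classifier the authors use, so every concrete choice here is an assumption made for illustration: a variance filter stands in for feature selection, a per-class random sample stands in for prototype extraction, Euclidean distance is the dissimilarity measure, and a 1-nearest-neighbour rule is the final classifier. All function names are hypothetical.

```python
import math
import random
from statistics import pvariance


def select_features(X, min_variance=1e-12):
    """Assumed stand-in for feature selection: keep non-constant columns."""
    columns = list(zip(*X))
    return [j for j, col in enumerate(columns) if pvariance(col) > min_variance]


def project_columns(X, keep):
    """Restrict every row to the selected feature indices."""
    return [[row[j] for j in keep] for row in X]


def pick_prototypes(X, y, per_class, seed=0):
    """Assumed stand-in for prototype extraction: sample up to `per_class`
    instances from each class, so the minority class is represented too."""
    rng = random.Random(seed)
    prototypes = []
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        prototypes.extend(rng.sample(members, min(per_class, len(members))))
    return prototypes


def to_dissimilarity_space(X, prototypes):
    """New feature k of an instance is its distance to prototype k."""
    return [[math.dist(x, p) for p in prototypes] for x in X]


def predict_1nn(D_train, y_train, d_query):
    """Assumed classifier: 1-nearest neighbour in the dissimilarity space."""
    nearest = min(range(len(D_train)), key=lambda i: math.dist(D_train[i], d_query))
    return y_train[nearest]


# Toy imbalanced data: six majority (0) rows, two minority (1) rows.
# The third feature is constant and should be dropped.
X = [[0.0, 0.1, 1.0], [0.2, 0.0, 1.0], [0.1, 0.3, 1.0], [0.3, 0.2, 1.0],
     [0.0, 0.4, 1.0], [0.4, 0.1, 1.0], [5.0, 5.1, 1.0], [5.2, 4.9, 1.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

keep = select_features(X)
X_reduced = project_columns(X, keep)
prototypes = pick_prototypes(X_reduced, y, per_class=2)
D_train = to_dissimilarity_space(X_reduced, prototypes)

query = project_columns([[4.8, 5.0, 1.0]], keep)
d_query = to_dissimilarity_space(query, prototypes)[0]
predicted = predict_1nn(D_train, y, d_query)
```

After projection, each instance is described by as many features as there are prototypes, regardless of the original dimensionality. Any standard classifier could replace the 1-nearest-neighbour rule at the last step.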
