SMOTE-NaN-DE: Addressing the noisy and borderline examples problem in imbalanced classification by natural neighbors and differential evolution


Abstract

Learning a classifier from class-imbalanced data is an important challenge. Among existing solutions, SMOTE is one of the most successful methods and has an extensive range of practical applications. The performance of SMOTE and its extensions usually degrades owing to noisy and borderline examples. Filtering-based methods have been developed to address this problem but still have the following technical defects: (a) error detection techniques rely heavily on parameter settings; (b) examples flagged by error detection techniques are eliminated directly, which shifts the obtained decision boundary and reintroduces class imbalance. To advance the state of the art, a novel filtering-based oversampling method called SMOTE-NaN-DE is proposed in this paper. In SMOTE-NaN-DE, a SMOTE-based method is first used to generate synthetic samples and improve the original class-imbalanced data. Second, an error detection technique based on natural neighbors is used to detect noisy and borderline examples. Third, differential evolution (DE) is used to iteratively optimize the position (attributes) of the detected examples instead of eliminating them. The main advantages of SMOTE-NaN-DE are that (a) it can improve almost all SMOTE-based methods with respect to the noise problem; (b) its error detection technique is parameter-free; (c) examples found by the error detection technique are optimized by differential evolution rather than removed, which preserves the imbalance ratio and improves the boundary; (d) it is more suitable for data sets with more noise (especially class noise). The effectiveness of the proposed SMOTE-NaN-DE is validated by intensive comparison experiments on artificial and real data sets. (C) 2021 Elsevier B.V. All rights reserved.
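As a rough illustration of two building blocks named in the abstract, the sketch below shows the classic SMOTE interpolation step and one differential evolution update applied to a flagged example. It is not the authors' implementation: the abstract does not state which DE variant, scale factor, crossover rate, or fitness function the paper uses, so the DE/rand/1/bin scheme, the values `f=0.5` and `cr=0.9`, the toy data, and all function names here are assumptions. The natural-neighbor detection step is not reproduced; the `suspect` point simply stands in for an example that step might flag.

```python
import random


def smote_sample(x, neighbor, rng):
    """Classic SMOTE interpolation: a random point on the segment x -> neighbor."""
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


def de_rand_1_mutation(a, b, c, f=0.5):
    """DE/rand/1 mutation (assumed variant): v = a + f * (b - c)."""
    return [ai + f * (bi - ci) for ai, bi, ci in zip(a, b, c)]


def binomial_crossover(target, mutant, cr, rng):
    """Take each attribute from the mutant with probability cr.

    One randomly chosen attribute is always taken from the mutant, so the
    trial vector differs from the target in at least one position.
    """
    forced = rng.randrange(len(target))
    return [
        m if (j == forced or rng.random() < cr) else t
        for j, (t, m) in enumerate(zip(target, mutant))
    ]


rng = random.Random(0)

# Toy minority-class examples (hypothetical data).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]

# Step 1: generate a synthetic sample between an example and one of its neighbors.
synthetic = smote_sample(minority[0], minority[1], rng)

# Step 3: instead of deleting a flagged example, propose a new position for it.
suspect = [2.5, 2.5]  # hypothetical example flagged as noisy or borderline
mutant = de_rand_1_mutation(minority[0], minority[1], minority[2], f=0.5)
trial = binomial_crossover(suspect, mutant, cr=0.9, rng=rng)
```

In a full DE loop, `trial` would replace `suspect` only if it scored better under some fitness function, and the update would repeat for several generations; the choice of fitness function is left open here because the abstract does not specify it.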

Bibliographic details

  • Source
    Knowledge-Based Systems | 2021, Issue 8 | 107056.1-107056.16 | 16 pages
  • Author affiliations

    Chongqing Univ Coll Comp Sci Chongqing Key Lab Software Theory & Technol Chongqing 400044 Peoples R China;

    Chongqing Univ Coll Comp Sci Chongqing Key Lab Software Theory & Technol Chongqing 400044 Peoples R China;

    Chongqing Univ Coll Comp Sci Chongqing Key Lab Software Theory & Technol Chongqing 400044 Peoples R China;

    Chongqing Univ Coll Comp Sci Chongqing Key Lab Software Theory & Technol Chongqing 400044 Peoples R China;

    Chongqing Univ Coll Comp Sci Chongqing Key Lab Software Theory & Technol Chongqing 400044 Peoples R China;

    Chongqing Univ Chongqing Key Lab Low Grade Energy Utilizat Techn Chongqing 400044 Peoples R China;

    Chinese Acad Sci Chongqing Inst Chongqing 400714 Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Class-imbalance learning; Class-imbalance classification; Oversampling; Differential evolution; Natural neighbors;
