Journal: Information Sciences: An International Journal

Kernel density estimation based sampling for imbalanced class distribution



Abstract

Imbalanced response variable distribution is a common occurrence in data science. In fields such as fraud detection, medical diagnostics, system intrusion detection, and many others where abnormal behavior is rarely observed, the data under study often feature a disproportionate target class distribution. One common way to combat class imbalance is to resample the minority class to achieve a more balanced distribution. In this paper, we investigate the performance of a sampling method based on kernel density estimation (KDE). We believe that KDE offers a more natural way to generate new instances of the minority class and is less prone to overfitting than other standard sampling techniques. It is based on a well-established theory of nonparametric statistical estimation. Numerical experiments show that KDE can outperform other sampling techniques on a range of real-life datasets as measured by F1-score and G-mean. The results remain consistent across a number of classification algorithms used in the experiments. Furthermore, the proposed method outperforms the benchmark methods regardless of the class distribution ratio. We conclude, based on the solid theoretical foundation and strong experimental results, that the proposed method would be a valuable tool in problems involving imbalanced class distribution. (C) 2019 Elsevier Inc. All rights reserved.
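
The idea of generating minority-class instances from a kernel density estimate can be illustrated with a minimal sketch. The abstract does not specify the kernel or the bandwidth selection rule, so the code below assumes a Gaussian kernel with a diagonal bandwidth chosen by Scott's rule; the function name `kde_oversample` and the toy data are hypothetical and are not taken from the paper. Sampling from a Gaussian KDE amounts to picking a stored minority point uniformly at random and perturbing it with Gaussian noise scaled by the bandwidth.

```python
import random
import statistics


def kde_oversample(minority, n_new, seed=None):
    """Draw n_new synthetic points from a Gaussian KDE fitted to `minority`.

    `minority` is a list of equal-length feature vectors (at least two rows).
    """
    rng = random.Random(seed)
    n, d = len(minority), len(minority[0])
    # Assumption: Scott's rule with a diagonal bandwidth,
    # h_j = sigma_j * n^(-1 / (d + 4)); the paper may use a different rule.
    factor = n ** (-1.0 / (d + 4))
    bandwidths = [
        statistics.stdev([row[j] for row in minority]) * factor
        for j in range(d)
    ]
    samples = []
    for _ in range(n_new):
        # Choose one kernel center uniformly, then sample from its Gaussian.
        center = rng.choice(minority)
        samples.append([rng.gauss(center[j], bandwidths[j]) for j in range(d)])
    return samples


# Toy usage: grow a 4-point minority class to match a majority class of 10.
minority = [[1.0, 2.0], [1.2, 1.9], [0.8, 2.2], [1.1, 2.1]]
majority_size = 10
synthetic = kde_oversample(minority, majority_size - len(minority), seed=0)
balanced_minority = minority + synthetic
```

Because the synthetic points are drawn from a smooth density and are not copies of existing rows, this kind of sampler avoids exact duplicates, which is one plausible reading of the abstract's claim about reduced overfitting.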
