Journal: IEEE Transactions on Neural Networks and Learning Systems

On the Dynamics of Classification Measures for Imbalanced and Streaming Data

Abstract

As each imbalanced classification problem comes with its own set of challenges, the measure used to evaluate classifiers must be individually selected. To help researchers make this decision in an informed manner, experimental and theoretical investigations compare general properties of measures. However, existing studies do not analyze changes in measure behavior imposed by different imbalance ratios. Moreover, several characteristics of imbalanced data streams, such as the effect of dynamically changing class proportions, have not been thoroughly investigated from the perspective of different metrics. In this paper, we study measure dynamics by analyzing changes of measure values, distributions, and gradients with diverging class proportions. For this purpose, we visualize measure probability mass functions and gradients. In addition, we put forward a histogram-based normalization method that provides a unified, probabilistic interpretation of any measure over data sets with different class distributions. The results of analyzing eight popular classification measures show that the effect class proportions have on each measure is different and should be taken into account when evaluating classifiers. Apart from highlighting imbalance-related properties of each measure, our study shows a direct connection between class ratio changes and certain types of concept drift, which could be influential in designing new types of classifiers and drift detectors for imbalanced data streams.
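The idea of a measure's probability mass function under a given class ratio can be illustrated with a small sketch. It is not the authors' procedure: it assumes a binary problem with fixed class counts, treats every possible confusion matrix as equally likely, and reads "histogram-based normalization" as a percentile rank within the resulting distribution. The function names (`confusion_matrices`, `measure_pmf`, `percentile_rank`) and the choice of accuracy and G-mean are hypothetical, chosen only for illustration.

```python
from collections import Counter
import math


def confusion_matrices(n_pos, n_neg):
    """Yield every (tp, fn, fp, tn) possible for fixed class counts."""
    for tp in range(n_pos + 1):
        for tn in range(n_neg + 1):
            yield tp, n_pos - tp, n_neg - tn, tn


def accuracy(tp, fn, fp, tn):
    return (tp + tn) / (tp + fn + fp + tn)


def g_mean(tp, fn, fp, tn):
    # Geometric mean of the two class-wise recalls; needs both classes present.
    return math.sqrt(tp / (tp + fn) * tn / (tn + fp))


def measure_pmf(measure, n_pos, n_neg, digits=6):
    """PMF of a measure, ASSUMING all confusion matrices are equally likely."""
    values = [round(measure(*cm), digits)
              for cm in confusion_matrices(n_pos, n_neg)]
    total = len(values)
    return {v: c / total for v, c in sorted(Counter(values).items())}


def percentile_rank(measure, value, n_pos, n_neg, digits=6):
    """Share of confusion matrices scoring at or below `value`.

    An illustrative normalization only, not the method from the paper.
    """
    pmf = measure_pmf(measure, n_pos, n_neg, digits)
    return sum(p for v, p in pmf.items() if v <= round(value, digits))


if __name__ == "__main__":
    # A majority-class predictor on 5 positives and 95 negatives.
    tp, fn, fp, tn = 0, 5, 0, 95
    for name, m in [("accuracy", accuracy), ("g_mean", g_mean)]:
        raw = m(tp, fn, fp, tn)
        rank = percentile_rank(m, raw, n_pos=5, n_neg=95)
        print(f"{name}: raw={raw:.3f} percentile_rank={rank:.3f}")
```

Running the same comparison with other class counts (for example `n_pos=50, n_neg=50`) shows how the distribution behind a raw score shifts with the class ratio, which is the kind of effect the abstract says should be taken into account when evaluating classifiers.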