A systematic study of the class imbalance problem in convolutional neural networks

Buda Mateusz; Maki Atsuto; Mazurowski Maciej A.

Abstract

In this study, we systematically investigate the impact of class imbalance on classification performance of convolutional neural networks (CNNs) and compare frequently used methods to address the issue. Class imbalance is a common problem that has been comprehensively studied in classical machine learning, yet very limited systematic research is available in the context of deep learning. In our study, we use three benchmark datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, to investigate the effects of imbalance on classification and perform an extensive comparison of several methods to address the issue: oversampling, undersampling, two-phase training, and thresholding that compensates for prior class probabilities. Our main evaluation metric is area under the receiver operating characteristic curve (ROC AUC) adjusted to multi-class tasks since overall accuracy metric is associated with notable difficulties in the context of imbalanced data. Based on results from our experiments we conclude that (i) the effect of class imbalance on classification performance is detrimental; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that completely eliminates the imbalance, whereas the optimal undersampling ratio depends on the extent of imbalance; (iv) as opposed to some classical machine learning models, oversampling does not cause overfitting of CNNs; (v) thresholding should be applied to compensate for prior class probabilities when overall number of properly classified cases is of interest. (c) 2018 Elsevier Ltd. All rights reserved.
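Two of the methods named in the abstract, oversampling to the point that removes the imbalance and thresholding that compensates for prior class probabilities, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `random_oversample` and `threshold_by_priors` are hypothetical. Replicating minority samples at random, and dividing predicted probabilities by training-set class frequencies, are assumed readings of the two methods. The sketch also assumes every class appears at least once in the training labels.

```python
import numpy as np


def random_oversample(labels, rng):
    """Return sample indices in which every class is replicated (with
    replacement) up to the size of the largest class."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    picked = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Draw the missing samples at random from the existing ones.
        extra = idx[rng.integers(0, count, size=target - count)]
        picked.append(np.concatenate([idx, extra]))
    return np.concatenate(picked)


def threshold_by_priors(probs, train_labels, n_classes):
    """Divide predicted class probabilities by the training-set class
    priors and renormalize each row to sum to one."""
    priors = np.bincount(train_labels, minlength=n_classes) / len(train_labels)
    adjusted = np.asarray(probs) / priors
    return adjusted / adjusted.sum(axis=1, keepdims=True)


# Toy data: 8 samples of class 0 and 2 samples of class 1.
labels = np.array([0] * 8 + [1] * 2)
rng = np.random.default_rng(0)

balanced_idx = random_oversample(labels, rng)
print(np.bincount(labels[balanced_idx]))  # [8 8]

# A prediction that favors the majority class before prior correction.
probs = np.array([[0.7, 0.3]])
print(threshold_by_priors(probs, labels, n_classes=2).argmax(axis=1))  # [1]
```

In the toy example the raw output favors class 0, but after division by the priors (0.8 and 0.2) the adjusted scores favor class 1. That is the intended effect of compensating for an imbalanced training distribution.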


Bibliographic details

Source
Neural Networks: The Official Journal of the International Neural Network Society | 2018, issue 2018 | 11 pages
Authors
Buda Mateusz; Maki Atsuto; Mazurowski Maciej A.
Author affiliations

Duke Univ Dept Radiol Sch Med Durham NC 27710 USA;

KTH Royal Inst Technol Sch Elect Engn & Comp Sci Stockholm Sweden;

Duke Univ Dept Radiol Sch Med Durham NC 27710 USA;

Indexing information
Original format: PDF
Text language: English
Chinese Library Classification: Neurology;
Keywords
Class imbalance; Convolutional neural networks; Deep learning; Image classification;


