Deep Over-sampling Framework for Classifying Imbalanced Data


Abstract

Class imbalance is a challenging issue in practical classification problems, for deep learning models as well as traditional models. Countermeasures that have traditionally worked well, such as synthetic over-sampling, have had limited success with the complex, structured data handled by deep learning models. In this paper, we propose Deep Over-sampling (DOS), a framework for extending the synthetic over-sampling method to the deep feature space acquired by a convolutional neural network (CNN). Its key feature is explicit, supervised representation learning, in which the training data pairs each raw input sample with a synthetic embedding target in the deep feature space, sampled from the linear subspace of in-class neighbors. We implement an iterative process of training the CNN and updating the targets, which induces smaller in-class variance among the embeddings and so increases the discriminative power of the deep representation. We present an empirical study using public benchmarks, which shows that the DOS framework not only counteracts class imbalance better than the existing method, but also improves the performance of the CNN in standard, balanced settings.
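The target-sampling step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `sample_embedding_targets`, the neighbor count `k`, and the choice of random non-negative weights normalized to sum to one (which restricts each target to a convex combination of the neighbors) are all assumptions made for illustration. The abstract itself states only that targets are sampled from the linear subspace of in-class neighbors.

```python
import numpy as np

def sample_embedding_targets(embeddings, labels, k=3, rng=None):
    """Hypothetical helper: draw one synthetic target per embedding.

    Each target is a random weighted combination of the k nearest
    embeddings that share the sample's class label (assumed scheme).
    """
    rng = np.random.default_rng() if rng is None else rng
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    targets = np.empty_like(embeddings)

    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        class_emb = embeddings[idx]

        # Pairwise squared Euclidean distances within the class.
        diff = class_emb[:, None, :] - class_emb[None, :, :]
        dist = np.einsum("ijd,ijd->ij", diff, diff)

        # k nearest in-class neighbors; a point is its own nearest
        # neighbor (distance 0), so it is included in the set.
        kk = min(k, len(idx))
        nn = np.argsort(dist, axis=1)[:, :kk]

        # Assumed weighting: uniform random weights, normalized to sum to 1.
        w = rng.random((len(idx), kk)) + 1e-12
        w /= w.sum(axis=1, keepdims=True)

        # Weighted sum of the neighbor embeddings gives the target.
        targets[idx] = np.einsum("ik,ikd->id", w, class_emb[nn])

    return targets
```

In the iterative process the abstract describes, a function like this would presumably be called again on fresh CNN embeddings after each training round, so that the targets follow the updated representation.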


Bibliographic details

Source: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases | 2017 | 852p | 16 pages in total
Authors: Shin Ando; Chun Yuan Huang
Original format: PDF
Chinese Library Classification: TP311.13-53
Keywords: Class imbalance; Convolutional neural network; Deep learning; Representation learning; Synthetic over-sampling


