Deep Over-sampling Framework for Classifying Imbalanced Data

Abstract

Class imbalance is a challenging issue in practical classification problems for deep learning models as well as traditional models. Traditionally successful countermeasures such as synthetic over-sampling have had limited success with complex, structured data handled by deep learning models. In this paper, we propose Deep Over-sampling (DOS), a framework for extending the synthetic over-sampling method to the deep feature space acquired by a convolutional neural network (CNN). Its key feature is an explicit, supervised representation learning, for which the training data presents each raw input sample with a synthetic embedding target in the deep feature space, which is sampled from the linear sub-space of in-class neighbors. We implement an iterative process of training the CNN and updating the targets, which induces smaller in-class variance among the embeddings, to increase the discriminative power of the deep representation. We present an empirical study using public benchmarks, which shows that the DOS framework not only counteracts class imbalance better than the existing method, but also improves the performance of the CNN in the standard, balanced settings.
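
The target-sampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_synthetic_targets`, the neighbor count `k`, and the use of Dirichlet-distributed convex weights are all assumptions made for illustration, since the abstract only states that each target is sampled from the linear sub-space of in-class neighbors. The embeddings here are random stand-ins for CNN features, and the iterative loop of retraining the CNN and updating the targets is omitted.

```python
import numpy as np

def sample_synthetic_targets(embeddings, labels, k=3, rng=None):
    """Draw one synthetic embedding target per sample.

    Each target is a random convex combination of the sample's k nearest
    same-class embeddings (the sample itself counts as its own nearest
    neighbor). The weighting scheme is an assumption, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    targets = np.empty_like(embeddings)
    for i in range(len(embeddings)):
        # Indices of all samples sharing this sample's class label.
        same = np.flatnonzero(labels == labels[i])
        # Euclidean distances to in-class samples, then keep the k closest.
        dists = np.linalg.norm(embeddings[same] - embeddings[i], axis=1)
        nearest = same[np.argsort(dists)[:k]]
        # Non-negative weights that sum to 1 give a convex combination.
        weights = rng.dirichlet(np.ones(len(nearest)))
        targets[i] = weights @ embeddings[nearest]
    return targets

# Toy imbalanced data: 20 majority-class and 5 minority-class embeddings.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0.0, 1.0, (20, 4)), rng.normal(5.0, 1.0, (5, 4))])
lab = np.array([0] * 20 + [1] * 5)
targets = sample_synthetic_targets(emb, lab, k=3, rng=rng)
```

In a full pipeline under these assumptions, `targets` would serve as regression targets for the embedding layer, and would be resampled after each round of training as the embeddings change.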


Bibliographic details

Source
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017, pp. 770-785 (16 pages)
Authors
Shin Ando; Chun Yuan Huang
Original format: PDF
Keywords
Class imbalance; Convolutional neural network; Deep learning; Representation learning; Synthetic over-sampling


Similar literature

1. An efficient algorithm coupled with synthetic minority over-sampling technique to classify imbalanced PubChem BioAssay data [J]. Ming Hao, Yanli Wang, Stephen H. Bryant. Analytica Chimica Acta, 2014.
2. PWIDB: A framework for learning to classify imbalanced data streams with incremental data re-balancing technique [J]. Rafiq Ahmed Mohammed, Kok-Wai Wong, Mohd Fairuz Shiratuddin. Procedia Computer Science, 2020, issue 5.
3. K-Neighbor over-sampling with cleaning data: a new approach to improve classification performance in data sets with class imbalance [J]. Budi Santoso, Hari Wijayanto, Khairil Anwar Notodiputro. Applied Mathematical Sciences, 2018, issue 9a12.
4. Deep Over-sampling Framework for Classifying Imbalanced Data [C]. Shin Ando, Chun Yuan Huang. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017.
5. Diversified ensemble classifiers for highly imbalanced data learning and its application in bioinformatics [D]. Ding, Zejin. 2011.
6. An efficient algorithm coupled with synthetic minority over-sampling technique to classify imbalanced PubChem BioAssay data [O]. Ming Hao, Yanli Wang, Stephen H. Bryant.
7. Deep Over-sampling Framework for Classifying Imbalanced Data [O]. Ando, Shin; Huang, Chun-Yuan. 2017.
