Oversampling method using outlier detectable generative adversarial network

Oh Joo-Hyuk; Hong Jae Yeol; Baek Jun-Geol

Abstract

A class imbalance problem occurs when one class contains significantly more or fewer samples than another. The problem is difficult to solve, but oversampling methods based on the synthetic minority oversampling technique (SMOTE) or the conditional generative adversarial network (cGAN) have recently been proposed. SMOTE and its variants can generate biased artificial data because they do not consider the minority class as a whole. Oversampling with cGAN was proposed to overcome this limitation. However, it does not consider the majority class, which also affects the classification boundary. In particular, an outlier in the majority class can bias the classification boundary. This paper presents an oversampling method using an outlier detectable generative adversarial network (OD-GAN) to address this issue. The discriminator, which in cGAN is used only for training, serves as an outlier detector that quantifies the difference between the distributions of the majority and minority classes. It detects and removes outliers, which prevents them from distorting the classification boundary. The generator imitates the distribution of the minority class and generates artificial data to balance the dataset. We experiment with various datasets, oversampling techniques, and classifiers. The empirical results show that OD-GAN outperforms other oversampling methods on imbalanced datasets with outliers. (C) 2019 Elsevier Ltd. All rights reserved.
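The workflow described in the abstract (remove outliers from the majority class, then generate minority samples until the classes are balanced) can be illustrated with a minimal NumPy sketch. This is not the paper's OD-GAN: the abstract gives no network details, so both parts are swapped for simple stand-ins. The hypothetical `drop_majority_outliers` replaces the discriminator with a robust z-score (median and MAD), and the hypothetical `smote_like_oversample` replaces the generator with SMOTE-style interpolation between minority neighbors. The threshold, `k`, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def drop_majority_outliers(X_maj, threshold=6.0):
    """Stand-in for the discriminator-based detector (an assumption, not OD-GAN).

    Scores each majority sample by its largest robust z-score across features
    and keeps only samples whose score is at most `threshold`.
    """
    center = np.median(X_maj, axis=0)
    # 1.4826 * MAD approximates the standard deviation for normal data.
    mad = 1.4826 * np.median(np.abs(X_maj - center), axis=0) + 1e-12
    scores = np.max(np.abs(X_maj - center) / mad, axis=1)
    return X_maj[scores <= threshold]


def smote_like_oversample(X_min, n_new, k=5, rng=None):
    """Stand-in for the generator: SMOTE-style interpolation.

    Each synthetic point lies on the segment between a random minority sample
    and one of its k nearest minority neighbors.
    """
    rng = np.random.default_rng(rng)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances; the diagonal is masked so that a point
    # is never selected as its own neighbor.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Toy data (assumed): 200 majority points with one planted outlier,
# and 20 minority points.
rng = np.random.default_rng(0)
X_maj = rng.normal(0.0, 1.0, size=(200, 2))
X_maj[0] = [50.0, 50.0]
X_min = rng.normal(3.0, 0.5, size=(20, 2))

X_maj_clean = drop_majority_outliers(X_maj)
X_syn = smote_like_oversample(X_min, len(X_maj_clean) - len(X_min), rng=rng)
X_min_balanced = np.vstack([X_min, X_syn])
```

Because every synthetic point is an interpolation between two existing minority samples, the stand-in generator only ever uses local neighborhoods. That is the limitation of SMOTE-style methods that the abstract points to, and a GAN-based generator is meant to address it by modeling the minority distribution as a whole.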

Bibliographic information

Source
Expert Systems with Applications | 2019, Issue 11 | pp. 1-8 | 8 pages
Authors
Oh Joo-Hyuk; Hong Jae Yeol; Baek Jun-Geol
Author affiliations
Korea Univ, Sch Ind Management Engn, 145 Anam Ro, Seoul 02841, South Korea (all three authors)
Original format
PDF
Language
English
Keywords
Class imbalance problem; Oversampling; Generative adversarial network; Outlier detection
