Improving classification accuracy using data augmentation on small data sets

Moreno-Barea Francisco J.; Jerez Jose M.; Franco Leonardo

Abstract

Data augmentation (DA) is a key element in the success of Deep Learning (DL) models, as its use can lead to better prediction accuracy values when large size data sets are used. DA was not very much used with earlier neural network models before 2012, and the reason might be related to the type of models and the size of the data sets used. We investigate in this work, applying several state-of-the-art models based on Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), the effect of DA when using small size data sets, analyzing the results in terms of the prediction accuracy obtained according to the different characteristics of the training samples (number of instances and features, and class unbalance degree). We further introduce modifications to the standard methods used to generate the synthetic samples to alter the class balance representation, and the overall results indicate that with some computational effort a significant increase in prediction accuracy can be obtained when small data sets are considered. (C) 2020 Elsevier Ltd. All rights reserved.

Bibliographic details

Source: Expert Systems with Applications, 2020, Issue 12, pp. 113696.1-113696.14 (14 pages)
Authors: Moreno-Barea Francisco J.; Jerez Jose M.; Franco Leonardo
Author affiliation (all three authors): Univ Malaga, Escuela Tecn Super Ingn Informat, Dept Lenguajes & Ciencias Comp, 35 Bulevar Louis Pasteur, Malaga, Spain
Original format: PDF
Language: English
Keywords: Deep Learning; Data augmentation; GAN; VAE; Unbalanced sets

Similar literature

1. A Hybrid Data Mining Technique for Improving the Classification Accuracy of Microarray Data Set [J]. Sujata Dash, Bichitrananda Patra, B.K. Tripathy. International Journal of Information Engineering and Electronic Business. 2012, No. 2
2. Neural Techniques for Improving the Classification Accuracy of Microarray Data Set using Rough Set Feature Selection Method [J]. Bichitrananda Patra, Sujata Dash, B. K. Tripathy. International Journal of Computer Trends and Technology. 2013, No. 3
3. Improving Classification Accuracy of IC Packaging Products Database Based on Variable Precision Rough Sets [J]. Yung-Hsiang Hung. Information Technology Journal. 2008, No. 3
4. Improving land-cover classification accuracy with a patch-based convolutional neural network: data augmentation and purposive sampling [C]. Hunsoo Song, Yongil Kim. Joint Urban Remote Sensing Event. 2019
5. An integrated atmospheric correction and classification system for remote sensing data to improve correction and classification accuracy. [D]. Mohamed el Mahboub, Widad Ibrahim. 2000
6. Improving the prediction accuracy in classification using the combined data sets by ranks of gene expressions [O]. Ki-Yeol Kim, Dong Hyuk Ki, Hei-Cheul Jeung. 2008
7. Table 4: Data augmentation accuracy comparisons (%) in different sizes of datasets (N) using ResNet50 on Intel Image Classification Dataset. [O].
