A selective sampling method for imbalanced data learning on support vector machines.

Abstract

The class imbalance problem in classification has been recognized as a significant research problem in recent years, and a number of methods have been introduced to improve classification results. Rebalancing class distributions (such as over-sampling or under-sampling of learning datasets) has been popular because it is easy to implement and performs relatively well. For the Support Vector Machine (SVM) classification algorithm, research efforts have focused on reducing the size of learning sets because of the algorithm's sensitivity to dataset size. In this dissertation, we propose a metaheuristic approach (a Genetic Algorithm) for under-sampling an imbalanced dataset in the context of an SVM classifier. The goal of this approach is to find an optimal learning set from imbalanced datasets without the empirical studies normally required to identify an optimal class distribution. Experimental results on real datasets indicate that this metaheuristic under-sampling performed well in rebalancing class distributions. Furthermore, an iterative sampling methodology was used to produce smaller learning sets by removing redundant instances. It incorporates informative and representative under-sampling mechanisms to speed up the learning procedure for imbalanced data learning with an SVM. Compared with existing rebalancing methods and the metaheuristic under-sampling approach, this iterative methodology not only provides good performance but also enables an SVM classifier to learn from very small learning sets. For large-scale imbalanced datasets, this methodology provides an efficient and effective solution for imbalanced data learning with an SVM.
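As an illustration only (not the dissertation's exact algorithm), the sketch below shows one common way GA-based under-sampling for an SVM can be set up: a binary chromosome marks which majority-class instances to keep, and the fitness of a chromosome is the balanced accuracy of an SVM trained on the selected majority subset plus all minority instances, measured on a held-out validation split. The function names, GA operators (truncation selection, one-point crossover, bit-flip mutation), all parameters, and the scikit-learn/NumPy usage are assumptions made for this example; `X` and `y` are assumed to be NumPy arrays.

```python
# Illustrative sketch only: GA-based under-sampling of the majority class
# for an SVM classifier. All design choices here are assumptions for the
# example, not the dissertation's method.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def fitness(mask, X_maj, y_maj, X_min, y_min, X_val, y_val):
    """Score an SVM trained on the selected majority subset plus all minority instances."""
    keep = mask.astype(bool)
    if keep.sum() == 0:                       # an empty majority subset cannot be trained on
        return 0.0
    X_tr = np.vstack([X_maj[keep], X_min])
    y_tr = np.concatenate([y_maj[keep], y_min])
    clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)
    return balanced_accuracy_score(y_val, clf.predict(X_val))

def ga_undersample(X, y, minority_label=1, pop_size=20, generations=30, mutation_rate=0.02):
    """Evolve a binary mask over majority-class instances; return the under-sampled learning set."""
    # Hold out a validation split so fitness is measured on data the SVM did not train on.
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)
    X_min, y_min = X_tr[y_tr == minority_label], y_tr[y_tr == minority_label]
    X_maj, y_maj = X_tr[y_tr != minority_label], y_tr[y_tr != minority_label]
    n = len(X_maj)
    # Initialize masks that keep roughly as many majority as minority instances.
    keep_prob = min(1.0, len(X_min) / n)
    pop = (rng.random((pop_size, n)) < keep_prob).astype(int)
    for _ in range(generations):
        scores = np.array([fitness(m, X_maj, y_maj, X_min, y_min, X_val, y_val) for m in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]   # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                               # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n) < mutation_rate                   # bit-flip mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(m, X_maj, y_maj, X_min, y_min, X_val, y_val) for m in pop])
    best = pop[int(np.argmax(scores))].astype(bool)
    X_bal = np.vstack([X_maj[best], X_min])
    y_bal = np.concatenate([y_maj[best], y_min])
    return X_bal, y_bal                        # rebalanced learning set for the final SVM fit
```

The rebalanced set would then be used as the SVM's learning set, e.g. `SVC().fit(*ga_undersample(X, y))`; other fitness measures (G-mean, F-measure) or selection and crossover schemes could be substituted without changing the overall structure.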

Bibliographic details

  • Author

    Choi, Jong Myong.

  • Affiliation

    Iowa State University.

  • Awarding institution: Iowa State University.
  • Subjects: Statistics; Computer Science; Industrial Engineering.
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 106 p.
  • Total pages: 106
  • Format: PDF
  • Language: eng
  • CLC classification:
  • Keywords:
