Applying Support Vector Machines to Imbalanced Datasets

Abstract

Support Vector Machines (SVM) have been extensively studied and have shown remarkable success in many applications. However, the success of SVM is very limited when it is applied to the problem of learning from imbalanced datasets in which negative instances heavily outnumber the positive instances (e.g., in gene profiling and detecting credit card fraud). This paper discusses the factors behind this failure and explains why the common strategy of undersampling the training data may not be the best choice for SVM. We then propose an algorithm for overcoming these problems which is based on a variant of the SMOTE algorithm by Chawla et al., combined with Veropoulos et al.'s different error costs algorithm. We compare the performance of our algorithm against these two algorithms, along with undersampling and regular SVM, and show that our algorithm outperforms all of them.
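As a rough illustration of the two ingredients the abstract names, the sketch below generates SMOTE-style synthetic positives by interpolating between minority points and then derives per-class error costs. It is not the authors' implementation: the function names `smote_oversample` and `error_costs`, the toy data, the neighbour count `k`, and the cost heuristic (cost ratio equal to the remaining class ratio) are all assumptions made for this example.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each lying on the segment between a
    minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


def error_costs(n_pos, n_neg, c=1.0):
    """Per-class misclassification costs (assumed heuristic: the positive
    class is penalised in proportion to how much it is outnumbered)."""
    return {"positive": c * n_neg / n_pos, "negative": c}


# Toy data (hypothetical): 4 positive and 40 negative instances.
positives = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
negatives = [(float(i % 5), float(i // 5) + 3.0) for i in range(40)]

synthetic = smote_oversample(positives, n_new=16)
balanced_positives = positives + synthetic
costs = error_costs(len(balanced_positives), len(negatives))
print(len(balanced_positives), costs)
```

The resulting costs could then be passed to any SVM implementation that accepts per-class penalties, so that oversampling narrows the imbalance and the cost ratio compensates for what remains.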

Bibliographic record

Source
European Conference on Machine Learning (ECML 2004); 20-24 September 2004; Pisa (IT) | 2004 | pp. 39-50 | 12 pages
Conference location: Pisa (IT)
Authors
Rehan Akbani; Stephen Kwek; Nathalie Japkowicz
Author affiliation
Department of Computer Science, University of Texas at San Antonio, 6900 N. Loop 1604 W, San Antonio, Texas, 78249, USA
Original format: PDF
Text language: eng
Chinese Library Classification: automated reasoning, machine learning
Date indexed: 2022-08-26 14:09:15

