Integration of feature vector selection and support vector machine for classification of imbalanced data

Liu Jie; Zio Enrico

Abstract

Support Vector Machines (SVMs) have been widely developed for tackling classification problems. Imbalanced data exist in many practical classification problems where the minority class is usually the one of interest. Undersampling is a popular solution for such problems. However, it risks losing useful information in the original data. At the same time, tuning the hyperparameters in SVM is also challenging. By analyzing the geometrical meaning of kernel methods, an approach is proposed in this paper that combines a modified Feature Vector Selection (FVS) method with maximal between-class separability and an easy-tuning version of SVM, i.e. Feature Vector Regression (FVR), proposed in our previous work. In this paper, the modified FVS method selects a small number of data points that can linearly represent the whole dataset in the Reproducing Kernel Hilbert Space (RKHS), and the selected data points also give a maximal separability of the imbalanced data in RKHS. The FVR model is also solved analytically, as in least-squares SVM. The decision threshold for classification is optimized to maximize the predefined accuracy metric. Twenty-six imbalanced datasets are considered and comparisons are carried out with several SVM-based methods for imbalanced data. Statistical tests show the effectiveness of the proposed method. (C) 2018 Elsevier B.V. All rights reserved.
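Two ingredients named in the abstract, a kernel model solved analytically and a decision threshold tuned to maximize a chosen metric, can be illustrated with a minimal sketch. This is not the authors' FVS or FVR method: plain kernel ridge regression (a regularized least-squares fit on all training points, without feature vector selection) stands in for the analytically solved model, and the G-mean is assumed as the "predefined accuracy metric". The function names, the toy data, and the values of `gamma` and `reg` are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_least_squares(X, y, gamma=1.0, reg=1e-2):
    """Closed-form solution of (K + reg*I) alpha = y (kernel ridge regression).

    Stand-in for an analytically solved kernel model; not the paper's FVR.
    """
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + reg * np.eye(len(X)), y)

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for labels in {-1, +1}."""
    pos, neg = y_true == 1, y_true == -1
    tpr = np.mean(y_pred[pos] == 1)
    tnr = np.mean(y_pred[neg] == -1)
    return float(np.sqrt(tpr * tnr))

def best_threshold(scores, y_true, metric=g_mean):
    """Sweep candidate thresholds and return the one that maximizes the metric."""
    s = np.unique(scores)
    # Midpoints between consecutive scores, plus one value beyond each end.
    candidates = np.concatenate(([s[0] - 1.0], (s[:-1] + s[1:]) / 2.0, [s[-1] + 1.0]))
    values = [metric(y_true, np.where(scores > t, 1, -1)) for t in candidates]
    i = int(np.argmax(values))
    return float(candidates[i]), float(values[i])

# Hypothetical imbalanced toy data: 200 majority points, 20 minority points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(2.5, 1.0, size=(20, 2))])
y = np.concatenate([-np.ones(200), np.ones(20)])

alpha = fit_kernel_least_squares(X, y, gamma=0.5, reg=1.0)
scores = rbf_kernel(X, X, gamma=0.5) @ alpha          # continuous model outputs
threshold, score = best_threshold(scores, y)           # tuned decision threshold
default_score = g_mean(y, np.where(scores > 0.0, 1, -1))  # naive threshold at 0
print(f"tuned threshold={threshold:.3f}, G-mean={score:.3f}, default G-mean={default_score:.3f}")
```

Because a least-squares fit on imbalanced labels tends to pull outputs toward the majority class, a threshold at zero can misclassify many minority points; sweeping the threshold on the continuous outputs is one simple way to compensate. For brevity the sketch tunes the threshold on the training scores; in practice a held-out validation set would be used.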

Bibliographic details

Source
Applied Soft Computing | 2019, issue 2019 | 10 pages
Authors
Liu Jie; Zio Enrico
Affiliations

Beihang Univ, Sch Reliabil & Syst Engn, 37 Xueyuan Rd, Beijing, Peoples R China;

Politecn Milan, Energy Dept, Milan, Italy

Original format: PDF
Language: eng
Chinese Library Classification: Computer software
Keywords
Classification; Feature Vector Selection; Imbalanced data; Support Vector Machine; Separability
