Sparse Proximal Support Vector Machines for feature selection in high dimensional datasets

Pappu Vijay; Panagopoulos Orestis P.; Xanthopoulos Petros; Pardalos Panos M.


Abstract

Classification of High Dimension Low Sample Size (HDLSS) datasets is a challenging task in supervised learning. Such datasets are prevalent in various areas, including biomedical applications and business analytics. In this paper, a new embedded feature selection method for HDLSS datasets is introduced by incorporating sparsity in Proximal Support Vector Machines (PSVMs). Our method, called Sparse Proximal Support Vector Machines (sPSVMs), learns a sparse representation of PSVMs by first casting it as an equivalent least squares problem and then introducing the l1-norm for sparsity. An efficient algorithm based on alternating optimization techniques is proposed. sPSVMs remove more than 98% of features in many high-dimensional datasets without compromising generalization performance. The stability of the feature selection process of sPSVMs is also studied and compared with univariate filter techniques. Additionally, sPSVMs offer the advantage of interpreting the selected features in the context of the classes by inducing class-specific local sparsity instead of the global sparsity induced by other embedded methods. sPSVMs appear to be robust with respect to data dimensionality. Moreover, sPSVMs perform feature selection and classification in one step, eliminating the need for separate dimensionality reduction of the data. As a result, sPSVMs can be used for preprocessing-free classification tasks. (C) 2015 Elsevier Ltd. All rights reserved.
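
The abstract's central idea, that adding an l1-norm penalty to a least squares problem drives most feature weights to exactly zero, can be illustrated with a small sketch. This is not the authors' alternating optimization algorithm and not the sPSVM formulation; it is a generic l1-regularized least squares fit solved by cyclic coordinate descent, shown only to make the sparsity mechanism concrete. The function names (`soft_threshold`, `lasso_coordinate_descent`), the penalty value `lam=5.0`, and the synthetic data are all assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal operator of t * |.|)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimise 0.5 * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    Generic illustration only; not the sPSVM algorithm from the paper.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            delta = w_new - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = w_new
    return w


# Synthetic HDLSS-style data (an assumption): 20 samples, 50 features,
# with the +1/-1 label depending only on the first two features.
rng = random.Random(0)
n, p = 20, 50
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1.0 if X[i][0] + X[i][1] > 0 else -1.0 for i in range(n)]

w = lasso_coordinate_descent(X, y, lam=5.0)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(len(selected), "of", p, "features kept:", selected)
```

Features whose weight ends at exactly zero are discarded, so selection and model fitting happen in the same step. Raising `lam` removes more features; once it exceeds the largest absolute inner product between a feature column and `y`, every weight is zero.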

Bibliographic details

Source: Expert Systems with Applications, 2015, Issue 23, pp. 9183-9191 (9 pages)
Authors: Pappu Vijay; Panagopoulos Orestis P.; Xanthopoulos Petros; Pardalos Panos M.
Author affiliations:

Univ Florida, Dept Ind Engn, Gainesville, FL 32608 USA;

Univ Cent Florida, Dept Ind Engn & Management Syst, Orlando, FL 32816 USA;

Univ Cent Florida, Dept Ind Engn & Management Syst, Orlando, FL 32816 USA;

Univ Florida, Dept Ind Engn, Gainesville, FL 32608 USA;

Original format: PDF
Language: English
Keywords:
Embedded feature selection; Sparsity; Regularization; Class-specific feature selection; High dimensional datasets;

Similar literature

1. Classification and Feature Selection Method for Medical Datasets by Brain Storm Optimization Algorithm and Support Vector Machine [J]. Eva Tuba, Ivana Strumberger, Timea Bezdan. Procedia Computer Science. 2019, Issue 41
2. Feature Selection Method Based on Artificial Bee Colony Algorithm and Support Vector Machines for Medical Datasets Classification [J]. Mustafa Serter Uzer, Nihat Yilmaz, Onur Inan. ScientificWorldJournal. 2013, Issue 3
3. Selecting Features Subsets Based on Support Vector Machine-Recursive Features Elimination and One Dimensional-Naïve Bayes Classifier using Support Vector Machines for Classification of Prostate and Breast Cancer [J]. Alhadi Bustamam, Anas Bachtiar, Devvi Sarwinda. Procedia Computer Science. 2019, Issue 11
4. A Novel Multi-Surface Proximal Support Vector Machine Classification Model Incorporating Feature Selection [C]. Ming Yang, Shuang Wei. Proceedings of the 2009 International Conference on Machine Learning and Cybernetics. 2009
5. Robust and efficient feature selection for high-dimensional datasets [D]. Mo, Dengyao. 2011
6. Feature Selection Method Based on Artificial Bee Colony Algorithm and Support Vector Machines for Medical Datasets Classification [O]. Mustafa Serter Uzer, Nihat Yilmaz, Onur Inan. 2013
7. Can dimensionality reduction through feature extraction improve classification accuracy compared to whole-brain analysis?: Using high-dimensional neuroimaging data as input for a support vector machine to distinguish Alzheimer patients from healthy controls [O]. Broers Thomas. 2017
