BMC Bioinformatics

Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data



Abstract

Background: Classification and variable selection play an important role in knowledge discovery in high-dimensional data. Although Support Vector Machine (SVM) algorithms are among the most powerful classification and prediction methods, with a wide range of scientific applications, the SVM does not include automatic feature selection, and a number of feature selection procedures have therefore been developed. Regularisation approaches extend the SVM to a feature selection method in a flexible way using penalty functions such as LASSO, SCAD and Elastic Net. We propose a novel penalty function for SVM classification tasks, Elastic SCAD, a combination of SCAD and ridge penalties that overcomes the limitations of each penalty alone. Since SVM models are extremely sensitive to the choice of tuning parameters, we adopted an interval search algorithm which, compared with a fixed grid search, finds a global optimal solution more rapidly and more precisely.

Results: Feature selection methods with combined penalties (Elastic Net and Elastic SCAD SVMs) are more robust to changes in model complexity than methods using single penalties. Our simulation study showed that Elastic SCAD SVM outperformed LASSO (L1) and SCAD SVMs. Moreover, Elastic SCAD SVM provided sparser classifiers than Elastic Net SVM in terms of the median number of features selected, and often predicted better than Elastic Net SVM in terms of misclassification error. Finally, we applied the penalization methods described above to four publicly available breast cancer data sets. Elastic SCAD SVM was the only method providing robust classifiers in both sparse and non-sparse situations.

Conclusions: The proposed Elastic SCAD SVM algorithm provides the advantages of the SCAD penalty and at the same time avoids sparsity limitations for non-sparse data. We were the first to demonstrate that integrating the interval search algorithm with penalized SVM classification techniques provides fast solutions for the optimization of tuning parameters. The penalized SVM classification algorithms, as well as fixed grid and interval search for finding appropriate tuning parameters, were implemented in our freely available R package 'penalizedSVM'. We conclude that the Elastic SCAD SVM is a flexible and robust tool for classification and feature selection tasks for high-dimensional data such as microarray data sets.
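The abstract describes Elastic SCAD as a combination of SCAD and ridge penalties but does not give the formula. The following is a minimal sketch of what such a penalty could look like. It assumes the standard piecewise SCAD form of Fan and Li with the commonly used shape parameter a = 3.7, and it assumes the ridge part is lam2 times the squared L2 norm of the weights. The function names and the parameters `lam1`, `lam2` and `a` are hypothetical and are not taken from the 'penalizedSVM' package.

```python
def scad_penalty(w, lam, a=3.7):
    """SCAD penalty for a single weight w (assumed standard Fan-Li form)."""
    abs_w = abs(w)
    if abs_w <= lam:
        # Small weights: behaves like the LASSO (L1) penalty.
        return lam * abs_w
    if abs_w <= a * lam:
        # Medium weights: quadratic transition that flattens out.
        return -(abs_w ** 2 - 2 * a * lam * abs_w + lam ** 2) / (2 * (a - 1))
    # Large weights: constant penalty, so they are not shrunk further.
    return (a + 1) * lam ** 2 / 2


def elastic_scad_penalty(weights, lam1, lam2, a=3.7):
    """Assumed Elastic SCAD form: SCAD(lam1) summed over weights plus ridge(lam2)."""
    scad_part = sum(scad_penalty(w, lam1, a) for w in weights)
    ridge_part = lam2 * sum(w * w for w in weights)
    return scad_part + ridge_part
```

Under these assumptions, setting `lam2 = 0` recovers the plain SCAD penalty, while a positive `lam2` adds ridge shrinkage on every weight, which is one way to read the abstract's claim that the combination avoids the sparsity limitation of SCAD alone.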
