Statistical enhancement of support vector machines.


Abstract

Support Vector Machines (SVM) and Random Forests (RF) have consistently outperformed other machine learning algorithms on a variety of problems. SVM can be used for classification and regression on many types of data (e.g. nonlinear, high dimensional), but cannot handle missing or mixed data. This research implements a permutation-based variable importance measure and a missing value imputation method for SVM, both founded on similar techniques developed for RF.

The results of the SVM variable importance measure are compared to RF results on simulated data sets with known variable importance. The variability of the importance outcomes is examined when different tuning parameter values (for SVM and RF) and kernels (for SVM) are used on benchmark data sets. Two of the benchmark data sets are also used to evaluate the missing value imputation method.

The variable importance measure developed in this study has comparable results to RF on the simulated data sets. However, the results have greater variability and less consistency than RF on the same benchmark data sets for the tuning parameter values investigated. SVM often had a smaller test error than RF, indicating that SVM was able to better fit the benchmark data. Unlike the RF results, the SVM variable importance results can be highly sensitive to the choice of tuning parameters. Successive grid searches are needed to tune these parameters and achieve more consistent SVM variable importance results.

This research compares a median-based missing value imputation method to a mean-based approach. The quality of the methods was evaluated by comparing the test set error (or test mean-square error) achieved after application to two benchmark data sets. There is improvement on a regression data set, but no significant difference in results for a classification example. Further investigation is needed to evaluate this imputation technique.

A variable importance measure for SVM provides insight into which explanatory variables are important in determining the response. SVM has been known to perform better than other machine learning algorithms on some data sets. By developing such a measure, this research has furthered the capabilities of an important algorithm used for data mining.
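The general idea behind a permutation-based variable importance measure can be sketched in a few lines. This is an illustrative sketch only, not the dissertation's implementation: the function name `permutation_importance`, the toy data, and the stand-in `predict` function (used here in place of a fitted SVM) are all assumptions made for the example. Importance is taken as the average increase in misclassification rate after one variable's values are shuffled.

```python
import random

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in misclassification rate when each column is shuffled.

    predict: callable mapping one row (list of floats) to a class label.
    X: list of rows; y: list of true labels (ideally held-out data).
    """
    rng = random.Random(seed)

    def error(rows):
        return sum(predict(r) != t for r, t in zip(rows, y)) / len(y)

    baseline = error(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the response.
            column = [row[j] for row in X]
            rng.shuffle(column)
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(error(permuted) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy data (assumed for illustration): the label depends on variable 0 only.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [1 if row[0] > 0 else 0 for row in X]

def predict(row):
    # Hypothetical stand-in for a fitted SVM's prediction function.
    return 1 if row[0] > 0 else 0

importances = permutation_importance(predict, X, y)
print(importances)  # variable 0 should score well above variable 1
```

Because the procedure only needs a prediction function, a tuned SVM could be substituted for `predict`. The abstract's finding that results are sensitive to tuning parameters suggests repeating the calculation across several parameter settings before drawing conclusions.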

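Bibliographic record

Author: Taylor, Aimee E.
Author affiliation: Oregon State University
Degree-granting institution: Oregon State University
Subject: Statistics; Computer Science; Operations Research
Degree: Ph.D.
Year: 2009
Pages: 156
Original format: PDF
Language: English
Chinese Library Classification: Statistics; Operations Research; Automation and Computer Technology
Date indexed: 2022-08-17 11:37:38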
