A fast SVM training method for very large datasets


Abstract

In a standard support vector machine (SVM), training has O(n³) time and O(n²) space complexity, where n is the size of the training dataset, which makes it computationally infeasible for very large datasets. Reducing the size of the training dataset is a natural way to address this problem. An SVM classifier depends only on its support vectors (SVs), which lie close to the separation boundary, so the samples that are likely to be SVs need to be retained. In this paper, we propose a method based on an edge detection technique to detect these samples. To preserve the overall distribution properties, we also use a clustering algorithm such as K-means to calculate the centroids of clusters. The samples selected by the edge detector and the cluster centroids are used to reconstruct the training dataset. The reconstructed training dataset is smaller, which makes training much faster without degrading classification accuracy.
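The reduction step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the edge detector, so the sketch substitutes a simple k-nearest-neighbour rule (a sample counts as an edge sample if any of its k nearest neighbours has a different label). Running K-means separately per class so that each centroid carries a label is also an assumption, and the names `edge_samples`, `kmeans`, and `reduce_training_set` are hypothetical.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def edge_samples(X, y, k=4):
    """Indices of samples with a differently-labelled sample among their
    k nearest neighbours (assumed stand-in for the paper's edge detector)."""
    selected = []
    for i, xi in enumerate(X):
        others = sorted((j for j in range(len(X)) if j != i),
                        key=lambda j: dist2(xi, X[j]))
        if any(y[j] != y[i] for j in others[:k]):
            selected.append(i)
    return selected


def kmeans(points, n_clusters, iters=20, seed=0):
    """Plain Lloyd's K-means; returns the list of centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: dist2(p, centroids[c]))
            groups[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def reduce_training_set(X, y, k=4, clusters_per_class=2):
    """Edge samples plus per-class K-means centroids, as (points, labels)."""
    idx = edge_samples(X, y, k)
    new_X = [X[i] for i in idx]
    new_y = [y[i] for i in idx]
    for label in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == label]
        n = min(clusters_per_class, len(members))
        for c in kmeans(members, n):
            new_X.append(c)
            new_y.append(label)
    return new_X, new_y
```

The reduced `(new_X, new_y)` pair would then be passed to any SVM trainer in place of the full dataset. Note that the brute-force neighbour search here is itself quadratic in n, so it only illustrates the selection idea; a practical edge detector for very large datasets would need to be cheaper.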

Bibliographic record

Source
International Joint Conference on Neural Networks (IJCNN 2009) | 2009 | pp. 1784-1789 | 6 pages
Conference location: Atlanta, GA (US)
Authors
Boyang Li; Qiangwei Wang; Jinglu Hu;
Author affiliation

Graduate School of Information, Production and Systems, Waseda University, Kitakyushu, Japan

Original format: PDF
Keywords
computational complexity; edge detection; pattern clustering; support vector machines; very large databases; K-means; SVM classifiers; classification accuracies; clustering algorithm; edge detection technique; fast SVM training method; space complexities; support vector machine; time complexities; training dataset; training process; very large datasets;


