Two-stage Incremental Working Set Selection for Fast Support Vector Training on Large Datasets

IEEE International Conference on Research, Innovation and Vision for the Future


Abstract

We propose iSVM, an incremental algorithm that achieves high training speed for support vector machines (SVMs) on large datasets. Within the common decomposition framework, iSVM starts from a minimal working set (WS) and then iteratively selects one training example to update the WS in each optimization loop. iSVM processes the training data in two stages. In the first stage, the most prominent vector among randomly sampled data is added to the WS; this stage yields an approximate SVM solution. The second stage uses the intermediate solution to scan the whole training data once more and pick up the remaining support vectors (SVs). We show that iSVM is especially efficient for training SVMs in applications where the data size is much larger than the number of SVs. On the KDD-CUP 1999 network intrusion detection dataset, with nearly five million training examples, iSVM takes less than one hour to train an SVM with 94% testing accuracy, compared to seven hours with LibSVM, one of the state-of-the-art SVM implementations. We also provide analysis and experimental comparisons between iSVM and related algorithms.
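The abstract only sketches the selection loop at a high level. The Python fragment below is a rough reconstruction of that two-stage idea, not the authors' implementation: it assumes labels in {-1, +1}, uses scikit-learn's SVC as a stand-in for the working-set QP solver, and approximates "most prominent" by the strongest margin violator within a random sample under the current intermediate solution.

```python
# Minimal sketch of two-stage incremental working-set selection (assumptions noted above).
import numpy as np
from sklearn.svm import SVC

def isvm_sketch(X, y, sample_size=64, max_rounds=200, rng=None):
    rng = np.random.default_rng(rng)
    n = len(y)

    # Stage 0: start from a minimal working set (here, one example per class).
    ws = [int(np.flatnonzero(y == c)[0]) for c in np.unique(y)]
    model = SVC(kernel="rbf", C=1.0).fit(X[ws], y[ws])

    # Stage 1: grow the WS one example per loop, picking the worst margin
    # violator among a small random sample of the training data.
    for _ in range(max_rounds):
        candidates = rng.choice(n, size=min(sample_size, n), replace=False)
        margins = y[candidates] * model.decision_function(X[candidates])
        if float(margins.min()) >= 1.0:   # sampled data already consistent: stop early
            break
        worst = int(candidates[int(np.argmin(margins))])
        if worst not in ws:
            ws.append(worst)
            model = SVC(kernel="rbf", C=1.0).fit(X[ws], y[ws])

    # Stage 2: one full scan with the intermediate solution to collect the
    # remaining support-vector candidates, then retrain once on the final WS.
    margins_all = y * model.decision_function(X)
    ws = sorted(set(ws) | set(np.flatnonzero(margins_all <= 1.0).tolist()))
    return SVC(kernel="rbf", C=1.0).fit(X[ws], y[ws]), ws
```

The point of the sketch is the cost profile the abstract claims: the per-loop work in stage 1 depends on the sample size and the current WS, not on the full dataset, and the full data is touched only once in stage 2, which is why the approach pays off when the number of SVs is far smaller than the number of training examples.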

