International Conference on Machine Learning

Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction

Abstract

The sparse support vector machine (SVM) is a popular classification technique that can simultaneously learn a small set of the most interpretable features and identify the support vectors. It has achieved great success in many real-world applications. However, for large-scale problems involving a huge number of samples and extremely high-dimensional features, solving sparse SVMs remains challenging. By noting that sparse SVMs induce sparsity in both the feature and sample spaces, we propose a novel approach, based on accurate estimations of the primal and dual optima of sparse SVMs, to simultaneously identify the features and samples that are guaranteed to be irrelevant to the outputs. We can then remove the identified inactive samples and features from the training phase, leading to substantial savings in both memory usage and computational cost without sacrificing accuracy. To the best of our knowledge, the proposed method is the first static feature and sample reduction method for sparse SVMs. Experiments on both synthetic and real datasets (e.g., the kddb dataset with about 20 million samples and 30 million features) demonstrate that our approach significantly outperforms state-of-the-art methods and that the speedup gained by our approach can be orders of magnitude.
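
To make the idea of "guaranteed irrelevant" samples and features concrete, the following is a minimal sketch of a generic ball-based screening test. It is not the estimation procedure of the paper. It assumes, purely for illustration, an elastic-net sparse SVM of the form min_w (1/n) Σ_i ℓ(1 − y_i⟨x_i, w⟩) + (α/2)‖w‖² + β‖w‖₁ with a hinge-type loss ℓ whose derivative vanishes for negative arguments. Under that assumption, a sample whose margin exceeds 1 has a zero dual variable, and a feature j with |(1/n) Σ_i θ_i y_i x_ij| < β has a zero weight. The function names, the toy data, and the ball centers and radii (`w_center`, `w_radius`, `theta_center`, `theta_radius`) are hypothetical inputs; how tight balls around the primal and dual optima are obtained is exactly what the paper contributes and is not reproduced here.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def norm(a):
    """Euclidean norm."""
    return math.sqrt(dot(a, a))


def inactive_samples(X, y, w_center, w_radius):
    """Indices i whose margin y_i * <x_i, w> exceeds 1 for EVERY w in the
    ball {w : ||w - w_center|| <= w_radius}. If the primal optimum lies in
    that ball, these samples incur zero loss (assumed hinge-type loss)."""
    kept_out = []
    for i, (x, label) in enumerate(zip(X, y)):
        # Cauchy-Schwarz lower bound on the margin over the whole ball
        # (labels are assumed to be +1 or -1).
        margin_lower = label * dot(x, w_center) - w_radius * norm(x)
        if margin_lower > 1.0:
            kept_out.append(i)
    return kept_out


def inactive_features(X, y, theta_center, theta_radius, beta):
    """Indices j with |(1/n) * sum_i theta_i * y_i * x_ij| < beta for EVERY
    theta in the ball around theta_center. If the dual optimum lies in that
    ball, the corresponding weights are zero (assumed elastic-net form)."""
    n = len(X)
    removed = []
    for j in range(len(X[0])):
        column = [label * x[j] for x, label in zip(X, y)]
        # Cauchy-Schwarz upper bound on the correlation over the whole ball.
        upper = (abs(dot(column, theta_center)) + theta_radius * norm(column)) / n
        if upper < beta:
            removed.append(j)
    return removed


# Toy, made-up numbers (not a solved SVM instance).
X = [[3.0, 0.1], [-3.0, 0.0], [0.2, -0.1]]
y = [1, -1, 1]
print(inactive_samples(X, y, w_center=[1.0, 0.0], w_radius=0.1))
print(inactive_features(X, y, theta_center=[0.0, 0.0, 1.0],
                        theta_radius=0.1, beta=0.1))
```

Both tests only ever declare an item inactive when the bound holds over the whole ball, so a looser ball removes fewer samples and features but never removes a relevant one, provided the ball really contains the optimum.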