Venue: SIAM International Conference on Data Mining

Twin Vector Machines for Online Learning on a Budget


Abstract

This paper proposes the Twin Vector Machine (TVM), a constant-space, sublinear-time Support Vector Machine (SVM) algorithm for online learning. TVM achieves its favorable scaling by keeping only a fixed number of examples, called the twin vectors, and their associated information in memory during training. In addition, TVM guarantees that the Kuhn-Tucker conditions are satisfied on all twin vectors at any time. To maximize the accuracy of TVM, twin vectors are adjusted during the training phase to approximate the data distribution near the decision boundary. Given a new training example, TVM is updated in three steps. First, the new example is added as a new twin vector if it is near the decision boundary. Second, if an example was added, two twin vectors are selected and merged into a single twin vector to maintain the budget. Finally, TVM is updated by incremental and decremental learning to account for the change. Several methods for twin vector merging were proposed and experimentally evaluated. TVMs were thoroughly tested on 12 large data sets. In most cases, the accuracy of low-budget TVMs was comparable to that of state-of-the-art resource-unconstrained SVMs. Additionally, the TVM accuracy was substantially higher than that of an SVM trained on a random sample of the same size. An even larger difference in accuracy was observed in comparison with Forgetron, a popular kernel perceptron algorithm on a budget. The results illustrate that highly accurate online SVMs could be trained from large data streams using devices with severely limited memory budgets.
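The three-step update described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the class `BudgetSketch`, the `band` threshold used to decide what counts as "near the decision boundary", the RBF kernel, and the merge rule (closest same-class pair, replaced by their count-weighted mean) are all assumptions made for illustration. In particular, the third step, incremental and decremental SVM learning, is replaced here by a simple stand-in that gives every absorbed example a fixed coefficient of 1, so the Kuhn-Tucker guarantee of TVM does not hold for this sketch.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class TwinVector:
    x: list       # position in input space
    y: int        # class label, +1 or -1
    count: int    # number of original examples this vector stands for
    alpha: float  # coefficient in the decision function


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


class BudgetSketch:
    """Hypothetical budget-maintenance sketch, not the TVM implementation."""

    def __init__(self, budget, band=1.0, gamma=1.0):
        if budget < 2:
            # With two classes, budget >= 2 guarantees that an over-budget
            # buffer always contains at least one same-class pair to merge.
            raise ValueError("budget must be at least 2")
        self.budget, self.band, self.gamma = budget, band, gamma
        self.vectors = []

    def decision(self, x):
        return sum(v.alpha * v.y * rbf(v.x, x, self.gamma) for v in self.vectors)

    def update(self, x, y):
        # Step 1: accept the example only if it lies near the boundary.
        # "Near" is assumed to mean |f(x)| <= band.
        if abs(self.decision(x)) > self.band:
            return False
        self.vectors.append(TwinVector(list(x), y, 1, 1.0))
        # Step 2: if the budget is exceeded, merge two vectors into one.
        if len(self.vectors) > self.budget:
            self._merge_closest_pair()
        # Step 3 (stand-in): TVM would now re-solve the SVM by incremental
        # and decremental learning; here the coefficients are left as sums.
        return True

    def _merge_closest_pair(self):
        pairs = [(a, b) for a, b in combinations(self.vectors, 2) if a.y == b.y]
        a, b = min(pairs, key=lambda p: math.dist(p[0].x, p[1].x))
        n = a.count + b.count
        merged = TwinVector(
            [(a.count * p + b.count * q) / n for p, q in zip(a.x, b.x)],
            a.y, n, a.alpha + b.alpha,
        )
        self.vectors = [v for v in self.vectors if v is not a and v is not b]
        self.vectors.append(merged)


# Usage on a tiny hand-made stream (toy data, not from the paper).
stream = [((2, 2), 1), ((-2, -2), -1), ((2.5, 2), 1), ((-2, -2.5), -1),
          ((0, 0.5), 1), ((0, -0.5), -1), ((1, 1), 1), ((-1, -1), -1)]
model = BudgetSketch(budget=4, band=1.0, gamma=0.5)
accepted = sum(model.update(x, y) for x, y in stream)
```

The buffer never holds more than `budget` vectors, and the `count` field records how many accepted examples each merged vector summarizes. A faithful implementation would replace the stand-in in step 3 with an SVM solver that restores the Kuhn-Tucker conditions after each insertion and merge.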