Machine learning and data mining via mathematical programming-based support vector machines.

Abstract

Several issues that arise in machine learning and data mining are addressed using mathematical programming-based support vector machines (SVMs). We address the following important problems.

Instead of a standard SVM that classifies points by assigning them to one of two disjoint halfspaces, points are classified by assigning them to the closest of two parallel planes (in input or feature space) that are pushed apart as far as possible. This formulation leads to an extremely fast and simple algorithm for generating a linear or nonlinear classifier that merely requires the solution of a single nonsingular system of linear equations. Multiclass and incremental extensions of this proximal formulation are also presented.

Prior knowledge, in the form of multiple polyhedral sets each belonging to one of two categories, is introduced into a reformulation of a linear SVM classifier. The resulting formulation is solved efficiently as a linear program and results in enhanced testing set correctness.

A finite concave minimization algorithm is proposed for constructing classifiers that use a minimal number of data points both in generating and characterizing classifiers. The algorithm is theoretically justified by linear programming perturbation theory and a leave-one-out error bound, as well as by effective computational results on several real-world datasets. Another very fast, Newton-based, stand-alone algorithm to solve this problem is also presented.

The problem of incorporating unlabeled data into a support vector machine is formulated as a concave minimization problem on a polyhedral set for which a stationary point is quickly obtained by solving a few (5 to 7) linear programs.

We also propose an implicit Lagrangian formulation of a support vector machine classifier that results in a highly effective iterative scheme and that is solved here by a finite Newton method. The proposed method, which is extremely fast and terminates in 6 or 7 iterations, can handle classification problems in very high dimensional spaces, e.g. over 28,000 dimensions, in a few seconds on a 400 MHz Pentium II machine. The method can also handle problems with large datasets and requires no specialized software other than a commonly available solver for a system of linear equations. Finite termination of the proposed method is established.

To sum up, we present several mathematical programming-based algorithms that address various important SVM-related issues such as speed, scalability, data dependence and sparse representation, use of unlabeled data, and knowledge incorporation.
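The abstract's claim that the proximal classifier needs only one linear system can be illustrated with a minimal sketch. It assumes the linear proximal SVM in its commonly published form, which the abstract itself does not spell out: with data matrix A, label vector d (entries +1 or -1), a column of ones e, and E = [A, -e], the plane parameters z = [w; gamma] solve (I/nu + E^T E) z = E^T d, and a point x is classified by the sign of x·w - gamma. The function names, the parameter `nu`, and the toy data are illustrative choices, not taken from the dissertation.

```python
def solve_linear_system(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    aug = [[float(v) for v in M[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def train_proximal_svm(points, labels, nu=1.0):
    """Fit a linear proximal classifier and return (w, gamma).

    Assumed formulation: (I/nu + E^T E) z = E^T d, with E = [A, -e].
    """
    rows = [list(p) + [-1.0] for p in points]  # rows of E = [A, -e]
    k = len(rows[0])
    # I/nu + E^T E is symmetric positive definite for nu > 0, hence nonsingular.
    lhs = [[sum(r[i] * r[j] for r in rows) + (1.0 / nu if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * d for r, d in zip(rows, labels)) for i in range(k)]
    z = solve_linear_system(lhs, rhs)
    return z[:-1], z[-1]


def classify(w, gamma, x):
    """Return +1 or -1 according to the sign of x.w - gamma."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - gamma
    return 1 if score >= 0 else -1


# Toy data (hypothetical): two well-separated clusters in the plane.
points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, gamma = train_proximal_svm(points, labels, nu=1.0)
predictions = [classify(w, gamma, p) for p in points]
```

The system has only as many unknowns as there are input dimensions plus one, which is why a general-purpose linear solver suffices in this sketch; the hand-written elimination routine stands in for any such solver.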

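Bibliographic record

  • Author: Fung, Glenn Martin
  • Author affiliation: The University of Wisconsin - Madison
  • Degree-granting institution: The University of Wisconsin - Madison
  • Subject: Mathematics; Computer Science
  • Degree: Ph.D.
  • Year: 2003
  • Pages: 191 p.
  • Total pages: 191
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Mathematics; Automation technology, computer technology
  • Keywords:
  • Date added to database: 2022-08-17 11:44:47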
