
Method and apparatus for efficient training of support vector machines


Abstract

The present invention provides a system and method for building fast and efficient support vector classifiers for large data classification problems. It is useful for classifying pages from the World Wide Web and for other problems involving sparse matrices and large numbers of documents. The method takes advantage of the least squares nature of such problems, employs exact line search in its iterative process, and makes use of a conjugate gradient method appropriate to the problem. In one embodiment, a support vector classifier useful for classifying a plurality of documents, including textual documents, is built by: selecting a plurality of training documents, each having suitable numeric attributes associated with a training document vector; initializing a classifier weight vector and a classifier intercept for a classifier boundary that separates at least two document classes; determining which training document vectors are suitable support vectors; and re-computing the classifier weight vector and the classifier intercept for the classifier boundary using the suitable support vectors together with an iteratively reindexed least squares method and a conjugate gradient method with a stopping criterion.
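
The loop described in the abstract (pick the current support vectors, solve a least squares problem on them by conjugate gradient, then take an exact line search step) can be illustrated with a small sketch. This is not the patented implementation. It assumes a linear classifier with a squared hinge loss, an intercept that is regularized together with the weights, dense NumPy arrays rather than sparse matrices, and labels in {-1, +1}. The names `cg_solve`, `exact_line_search`, `train_l2svm`, the parameter `lam`, and the toy data are all hypothetical.

```python
import numpy as np

def cg_solve(XI, yI, lam, beta0, tol=1e-8, max_iter=200):
    """Conjugate gradient for (lam*I + XI^T XI) beta = XI^T yI, matrix-free."""
    beta = beta0.copy()
    r = XI.T @ (yI - XI @ beta) - lam * beta   # residual of the linear system
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:                 # stopping criterion (assumed form)
            break
        Ap = XI.T @ (XI @ p) + lam * p
        alpha = rs / (p @ Ap)
        beta += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return beta

def exact_line_search(Xb, y, lam, beta, d):
    """Minimize the piecewise-quadratic objective along beta + t*d for t >= 0."""
    o, Xd = Xb @ beta, Xb @ d
    def slope(t):                              # derivative of the objective in t
        bt, ot = beta + t * d, o + t * Xd
        act = y * ot < 1                       # examples with margin below 1
        return lam * (bt @ d) + (ot[act] - y[act]) @ Xd[act]
    yd = y * Xd
    moving = yd != 0
    bps = (1 - y * o)[moving] / yd[moving]     # t where an example's margin hits 1
    lo, g_lo = 0.0, slope(0.0)
    if g_lo >= 0:
        return 0.0
    for hi in np.sort(bps[bps > 0]):
        g_hi = slope(hi)
        if g_hi >= 0:                          # slope is linear on [lo, hi]
            return lo - g_lo * (hi - lo) / (g_hi - g_lo)
        lo, g_lo = hi, g_hi
    return lo - g_lo / (slope(lo + 1.0) - g_lo)  # last piece is linear too

def train_l2svm(X, y, lam=1.0, max_outer=50):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # last column carries the intercept
    beta = np.zeros(Xb.shape[1])
    for _ in range(max_outer):
        active = y * (Xb @ beta) < 1               # current support vectors
        beta_bar = cg_solve(Xb[active], y[active], lam, beta)
        if np.array_equal(y * (Xb @ beta_bar) < 1, active):
            beta = beta_bar                        # support vector set is stable
            break
        d = beta_bar - beta
        beta = beta + exact_line_search(Xb, y, lam, beta, d) * d
    return beta[:-1], beta[-1]

# Toy usage on six 2-D points (illustrative data only).
X = np.array([[2.0, 0.0], [3.0, 1.0], [2.5, -1.0],
              [-2.0, 0.0], [-3.0, 1.0], [-2.5, -1.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
w, b = train_l2svm(X, y, lam=1.0)
pred = np.sign(X @ w + b)
```

For readability the line search re-evaluates the slope from scratch at every breakpoint, which costs one pass over the data per breakpoint. An implementation aimed at large sparse document collections would more plausibly update the slope incrementally and keep `X` in a sparse format.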
