
Scalable kernel methods for machine learning.


Abstract

Machine learning techniques are now essential for a diverse set of applications in computer vision, natural language processing, software analysis, and many other domains. As more applications emerge and the amount of data continues to grow, there is a need for increasingly powerful and scalable techniques. Kernel methods, which generalize linear learning methods to non-linear ones, have become a cornerstone of much recent work in machine learning and have been used successfully for core tasks such as clustering, classification, and regression.

Despite the recent popularity of kernel methods, a number of issues must be tackled for them to succeed on large-scale data. First, kernel methods typically require memory that grows quadratically in the number of data objects, making it difficult to scale to large data sets. Second, kernel methods depend on an appropriate kernel function (an implicit mapping to a high-dimensional space), and it is not clear how to choose one, since the right choice depends on the data. Third, in the context of data clustering, kernel methods have not been demonstrated to be practical for real-world clustering problems.

This thesis explores these questions, offers some novel solutions to them, and applies the results to a number of challenging applications in computer vision and other domains. We explore two broad fundamental problems in kernel methods. First, we introduce a scalable framework for learning kernel functions by incorporating prior knowledge about the data. This framework scales to very large data sets of millions of objects, can be used for a variety of complex data, and outperforms several existing techniques. In the transductive setting, the method can be used to learn low-rank kernels, whose memory requirements are linear in the number of data points. We also explore extensions of this framework and its applications to image search problems, such as object recognition, human body pose estimation, and 3-D reconstruction. As a second problem, we explore the use of kernel methods for clustering. We show a mathematical equivalence between several graph cut objective functions and the weighted kernel k-means objective. This equivalence leads to the first eigenvector-free algorithm for weighted graph cuts, which is thousands of times faster than existing state-of-the-art techniques while using significantly less memory. We benchmark this algorithm against existing methods, apply it to image segmentation, and explore extensions to semi-supervised clustering.
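The abstract does not spell out the kernel-learning formulation. In the published work this thesis builds on (Kulis, Sustik, and Dhillon, "Learning Low-Rank Kernel Matrices", ICML 2006), kernels are learned by minimizing a Bregman matrix divergence, such as the LogDet divergence, between the learned kernel K and a prior kernel K_0, subject to pairwise distance constraints; roughly:

    D_{\mathrm{ld}}(K, K_0) = \operatorname{tr}(K K_0^{-1}) - \log\det(K K_0^{-1}) - n

    \min_{K \succeq 0} \; D_{\mathrm{ld}}(K, K_0)
    \quad \text{s.t.} \quad K_{ii} + K_{jj} - 2K_{ij} \le u \;\; \text{(similar pairs)},
    \qquad K_{ii} + K_{jj} - 2K_{ij} \ge \ell \;\; \text{(dissimilar pairs)}

Each Bregman projection onto a single constraint is a rank-one update of the form K <- K + beta * K z z^T K, which keeps the range space of K inside that of K_0; this is what allows a low-rank factorization K = G G^T to be maintained throughout, giving the memory requirement linear in the number of data points that the abstract mentions.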
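To make the memory claim concrete, here is a minimal sketch (illustrative code, not from the thesis) of kernel k-means run directly on a low-rank factor G with K = G G^T; the unweighted case is shown for brevity, whereas the thesis treats the weighted case:

    import numpy as np

    def lowrank_kernel_kmeans(G, k, n_iter=100, seed=0):
        # Kernel k-means on a low-rank factor G (n x r), where K = G @ G.T.
        # Storing G costs O(n*r) memory; the full kernel K would cost O(n^2).
        n, _ = G.shape
        rng = np.random.default_rng(seed)
        labels = rng.integers(0, k, size=n)
        diag_K = np.einsum('ij,ij->i', G, G)          # K_ii for each point
        for _ in range(n_iter):
            dists = np.full((n, k), np.inf)
            for c in range(k):
                members = labels == c
                if not members.any():
                    continue                          # skip empty clusters
                g_c = G[members].mean(axis=0)         # cluster mean in factor space
                # ||phi(a_i) - m_c||^2 = K_ii - 2 g_i.g_c + g_c.g_c
                dists[:, c] = diag_K - 2.0 * (G @ g_c) + g_c @ g_c
            new_labels = dists.argmin(axis=1)
            if np.array_equal(new_labels, labels):
                break                                 # assignments converged
            labels = new_labels
        return labels

Because the kernel distance collapses to a Euclidean distance in factor space, this is just k-means on the rows of G, which is exactly why a low-rank kernel brings the memory cost from O(n^2) down to O(nr).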

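The graph-cut equivalence mentioned in the abstract can be stated compactly in standard notation (following the published Dhillon-Guan-Kulis line of work; the exact presentation in the thesis may differ). The weighted kernel k-means objective is

    J(\{\pi_c\}) = \sum_{c=1}^{k} \sum_{a_i \in \pi_c} w_i \,\lVert \phi(a_i) - m_c \rVert^2,
    \qquad m_c = \frac{\sum_{a_j \in \pi_c} w_j \phi(a_j)}{\sum_{a_j \in \pi_c} w_j}

Minimizing J is equivalent to maximizing the trace \operatorname{tr}(Y^\top W^{1/2} K W^{1/2} Y), where K_{ij} = \phi(a_i)^\top \phi(a_j) and Y is a weighted cluster-indicator matrix. Normalized cut on a graph with adjacency matrix A and degree matrix D maximizes the same trace once one sets

    w_i = d_i, \qquad K = \sigma D^{-1} + D^{-1} A D^{-1},

with \sigma chosen large enough to make K positive semidefinite. Graph cut objectives can therefore be optimized by plain kernel k-means iterations, with no eigenvector computation at all, which is the basis of the eigenvector-free algorithm described above.
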
Bibliographic record

  • Author

    Kulis, Brian Joseph.

  • Affiliation

    The University of Texas at Austin.

  • Degree grantor: The University of Texas at Austin.
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 205
  • Format: PDF
  • Language: English
  • Classification (CLC): Automation and computer technology

