
Large-scale machine learning using kernel methods.



Abstract

Kernel methods, such as Support Vector Machines (SVMs), are a core machine-learning technology. They enjoy strong theoretical foundations and have achieved excellent empirical success in many pattern-recognition applications. However, when kernel methods are applied to emerging large-scale applications such as video surveillance, multimedia information retrieval, and web mining, they face the challenges of ineffective and inefficient training. In this dissertation, we explore these challenges and propose strategies for addressing them.

We first investigate the imbalanced-training challenge, which causes the training of kernel methods to be ineffective. The problem occurs when the training instances of the target class are significantly outnumbered by the other training instances. In such situations, we show that the class boundary trained by SVMs can be severely skewed toward the target class. To tackle this challenge, we propose applying a conformal transformation to the kernel function in Reproducing Kernel Hilbert Space.

The training performance of kernel methods depends greatly on the chosen kernel function or matrix, which defines a pairwise similarity measure between two data instances. We therefore develop an algorithm that formulates a context-dependent distance function for measuring such similarity, and we demonstrate that the learned distance function improves performance on kernel-based clustering and classification tasks. Moreover, we investigate situations where the similarity measure used to formulate the kernel does not induce a positive semi-definite (psd) kernel matrix, and hence cannot be used directly for training with kernel methods. We propose an analytical framework for evaluating several representative spectrum-transformation methods, which convert such an indefinite similarity matrix into a psd one.

Finally, we address the efficiency of kernel methods in order to achieve fast training on massive data, focusing in particular on Support Vector Machines. Traditional SVM solvers suffer from a widely known scalability problem. We propose an incremental algorithm that performs approximate matrix-factorization operations to speed up SVM training. Two approximate factorization schemes, Kronecker and incomplete Cholesky, are utilized within the primal-dual interior-point method (IPM) to directly solve the quadratic optimization problem in SVMs.

Through theoretical analysis and extensive empirical studies, we show that our proposed approaches perform more effectively and efficiently than traditional methods.
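To make the spectrum-transformation idea concrete, below is a minimal sketch (in Python, with illustrative names not taken from the dissertation) of one representative method, often called spectrum clip: zero out the negative eigenvalues of an indefinite similarity matrix so that the result is a valid psd kernel matrix.

```python
import numpy as np

def spectrum_clip(S):
    """Return a psd version of a symmetric similarity matrix S by
    clipping its negative eigenvalues to zero (spectrum clip)."""
    S = (S + S.T) / 2.0                # symmetrize to guard against noise
    w, U = np.linalg.eigh(S)           # spectral decomposition S = U diag(w) U^T
    w_clipped = np.clip(w, 0.0, None)  # drop the negative part of the spectrum
    return U @ np.diag(w_clipped) @ U.T

# Example: a symmetric similarity matrix with one negative eigenvalue
S = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.9],
              [0.2, 0.9, 1.0]])
K = spectrum_clip(S)
print(np.linalg.eigvalsh(K))  # all eigenvalues are now >= 0 (up to round-off)
```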
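Likewise, here is a minimal sketch of a pivoted incomplete Cholesky factorization, one of the two approximation schemes mentioned above, under the assumption of a generic kernel function (the `rbf` below is illustrative, not the dissertation's API). It builds a low-rank factor G with G @ G.T ≈ K without ever materializing the full kernel matrix, so a solver can work with the factor instead of the dense matrix.

```python
import numpy as np

def incomplete_cholesky(kernel, X, rank, tol=1e-8):
    """Pivoted incomplete Cholesky: compute G (n x k, k <= rank) such that
    G @ G.T approximates K[i, j] = kernel(X[i], X[j]) without forming K."""
    n = len(X)
    G = np.zeros((n, rank))
    d = np.array([kernel(x, x) for x in X])  # residual diagonal of K
    for k in range(rank):
        i = int(np.argmax(d))                # greedy pivot: largest residual
        if d[i] <= tol:                      # remaining error is negligible
            return G[:, :k]
        col = np.array([kernel(X[j], X[i]) for j in range(n)])  # pivot column of K
        G[:, k] = (col - G[:, :k] @ G[i, :k]) / np.sqrt(d[i])
        d -= G[:, k] ** 2                    # update residual diagonal
    return G

# Toy usage with an RBF kernel (names are assumptions for illustration)
rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2))
X = np.random.default_rng(0).normal(size=(200, 5))
G = incomplete_cholesky(rbf, X, rank=30)
K = np.array([[rbf(a, b) for b in X] for a in X])
print(np.linalg.norm(K - G @ G.T))           # small Frobenius approximation error
```

In an interior-point SVM solver, each Newton step touches the kernel matrix only through products and linear solves; replacing K by a low-rank factor lets those be carried out via the Sherman-Morrison-Woodbury identity at a cost that, for fixed rank, grows linearly rather than cubically in the number of training instances.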

Bibliographic Record

  • Author

    Wu, Gang

  • Author's Affiliation

    University of California, Santa Barbara

  • Degree-Granting Institution: University of California, Santa Barbara
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 168 p.
  • Total Pages: 168
  • Format: PDF
  • Language: eng
  • Classification (CLC): Automation and Computer Technology

