Annual Conference on Neural Information Processing Systems

Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning



Abstract

We consider the minimization of a convex objective function defined on a Hilbert space, which is only available through unbiased estimates of its gradients. This problem includes standard machine learning algorithms such as kernel logistic regression and least-squares regression, and is commonly referred to as a stochastic approximation problem in the operations research community. We provide a non-asymptotic analysis of the convergence of two well-known algorithms, stochastic gradient descent (a.k.a. Robbins-Monro algorithm) as well as a simple modification where iterates are averaged (a.k.a. Polyak-Ruppert averaging). Our analysis suggests that a learning rate proportional to the inverse of the number of iterations, while leading to the optimal convergence rate in the strongly convex case, is not robust to the lack of strong convexity or the setting of the proportionality constant. This situation is remedied when using slower decays together with averaging, robustly leading to the optimal rate of convergence. We illustrate our theoretical results with simulations on synthetic and standard datasets.

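To make the comparison concrete, below is a minimal sketch (not the paper's experiments) of stochastic gradient descent on a synthetic least-squares problem, contrasting a learning rate proportional to 1/n with a slower 1/sqrt(n) decay combined with Polyak-Ruppert iterate averaging; the data-generating model, constants, and function names are illustrative assumptions.

# Minimal sketch, assuming a synthetic least-squares objective with unbiased
# stochastic gradients; all constants and names below are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d, n_iter = 5, 10_000
w_star = rng.normal(size=d)                 # ground-truth weights

def sample_gradient(w):
    """Unbiased estimate of the gradient of the expected squared loss."""
    x = rng.normal(size=d)
    y = x @ w_star + 0.1 * rng.normal()
    return (x @ w - y) * x                  # gradient of 0.5 * (x'w - y)^2

def sgd(step_rule, average=False):
    """Run SGD with a given step-size schedule; optionally return the averaged iterate."""
    w = np.zeros(d)
    w_bar = np.zeros(d)
    for n in range(1, n_iter + 1):
        w -= step_rule(n) * sample_gradient(w)
        w_bar += (w - w_bar) / n            # running Polyak-Ruppert average
    return w_bar if average else w

# Learning rate proportional to 1/n (optimal rate under strong convexity, but
# sensitive to the proportionality constant) versus a slower 1/sqrt(n) decay
# combined with iterate averaging (the robust choice suggested by the analysis).
w_inv = sgd(lambda n: 1.0 / n)
w_avg = sgd(lambda n: 1.0 / np.sqrt(n), average=True)

print("error, gamma_n = 1/n            :", np.linalg.norm(w_inv - w_star))
print("error, gamma_n = 1/sqrt(n), avg :", np.linalg.norm(w_avg - w_star))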

