Annual Conference on Neural Information Processing Systems

Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning



Abstract

We consider the minimization of a convex objective function defined on a Hilbert space, which is only available through unbiased estimates of its gradients. This problem includes standard machine learning algorithms such as kernel logistic regression and least-squares regression, and is commonly referred to as a stochastic approximation problem in the operations research community. We provide a non-asymptotic analysis of the convergence of two well-known algorithms, stochastic gradient descent (a.k.a. Robbins-Monro algorithm) as well as a simple modification where iterates are averaged (a.k.a. Polyak-Ruppert averaging). Our analysis suggests that a learning rate proportional to the inverse of the number of iterations, while leading to the optimal convergence rate in the strongly convex case, is not robust to the lack of strong convexity or the setting of the proportionality constant. This situation is remedied when using slower decays together with averaging, robustly leading to the optimal rate of convergence. We illustrate our theoretical results with simulations on synthetic and standard datasets.

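To make the comparison concrete, below is a minimal sketch (not the paper's experiments) of stochastic gradient descent on a synthetic least-squares problem, contrasting a learning rate proportional to 1/n with a slower 1/sqrt(n) decay combined with Polyak-Ruppert iterate averaging; the data-generating model, constants, and function names are illustrative assumptions.

# Minimal sketch, assuming a synthetic least-squares objective with unbiased
# stochastic gradients; all constants and names below are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d, n_iter = 5, 10_000
w_star = rng.normal(size=d)                 # ground-truth weights

def sample_gradient(w):
    """Unbiased estimate of the gradient of the expected squared loss."""
    x = rng.normal(size=d)
    y = x @ w_star + 0.1 * rng.normal()
    return (x @ w - y) * x                  # gradient of 0.5 * (x'w - y)^2

def sgd(step_rule, average=False):
    """Run SGD with a given step-size schedule; optionally return the averaged iterate."""
    w = np.zeros(d)
    w_bar = np.zeros(d)
    for n in range(1, n_iter + 1):
        w -= step_rule(n) * sample_gradient(w)
        w_bar += (w - w_bar) / n            # running Polyak-Ruppert average
    return w_bar if average else w

# Learning rate proportional to 1/n (optimal rate under strong convexity, but
# sensitive to the proportionality constant) versus a slower 1/sqrt(n) decay
# combined with iterate averaging (the robust choice suggested by the analysis).
w_inv = sgd(lambda n: 1.0 / n)
w_avg = sgd(lambda n: 1.0 / np.sqrt(n), average=True)

print("error, gamma_n = 1/n            :", np.linalg.norm(w_inv - w_star))
print("error, gamma_n = 1/sqrt(n), avg :", np.linalg.norm(w_avg - w_star))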

