JMLR: Workshop and Conference Proceedings

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron

Abstract

Modern machine learning focuses on highly expressive models that are able to fit or interpolate the data completely, resulting in zero training loss. For such models, we show that the stochastic gradients of common loss functions satisfy a strong growth condition. Under this condition, we prove that constant step-size stochastic gradient descent (SGD) with Nesterov acceleration matches the convergence rate of the deterministic accelerated method for both convex and strongly-convex functions. We also show that this condition implies that SGD can find a first-order stationary point as efficiently as full gradient descent in non-convex settings. Under interpolation, we further show that all smooth loss functions with a finite-sum structure satisfy a weaker growth condition. Given this weaker condition, we prove that SGD with a constant step-size attains the deterministic convergence rate in both the strongly-convex and convex settings. Under additional assumptions, the above results enable us to prove an $O(1/k^2)$ mistake bound for $k$ iterations of a stochastic perceptron algorithm using the squared-hinge loss. Finally, we validate our theoretical findings with experiments on synthetic and real datasets.
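
For context, the two growth conditions named above are usually written as follows for a finite-sum objective $f(w) = \frac{1}{n}\sum_{i=1}^{n} f_i(w)$ with $L$-smooth components and minimizer $w^*$; this is a sketch of the standard formulation rather than a quote from the paper, and the exact constants may differ:

$$\mathbb{E}_i\left[\|\nabla f_i(w)\|^2\right] \le \rho \, \|\nabla f(w)\|^2 \qquad \text{(strong growth condition)}$$

$$\mathbb{E}_i\left[\|\nabla f_i(w)\|^2\right] \le 2\rho L \left(f(w) - f(w^*)\right) \qquad \text{(weak growth condition)}$$

Under interpolation every $f_i$ is minimized at $w^*$, so $\nabla f(w^*) = 0$ forces $\nabla f_i(w^*) = 0$ for all $i$, which is what makes conditions of this form plausible for over-parameterized models.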
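To make the interpolation setting concrete, the following is a minimal, hypothetical sketch (not the authors' code): constant step-size SGD on the squared-hinge loss over a linearly separable synthetic dataset, mirroring the stochastic perceptron setup mentioned in the abstract. The step size 1/(2 max_i ||x_i||^2) is a heuristic stand-in for the constant prescribed by the theory, and all names below are illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Linearly separable data: labels are the sign of a random hyperplane,
# so a zero-loss (interpolating) linear classifier exists.
n, d = 1000, 20
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d))

def squared_hinge_grad(w, x, yi):
    """Gradient of max(0, 1 - yi * <w, x>)**2 at a single example."""
    margin = 1.0 - yi * (x @ w)
    if margin <= 0.0:
        return np.zeros_like(w)
    return -2.0 * margin * yi * x

# Constant step size: the smoothness of each squared-hinge term scales
# with 2 * ||x_i||^2, so 1 / (2 * max_i ||x_i||^2) is a safe heuristic.
step = 1.0 / (2.0 * np.max(np.sum(X**2, axis=1)))

w = np.zeros(d)
for _ in range(100_000):
    i = rng.integers(n)                      # sample one example uniformly
    w -= step * squared_hinge_grad(w, X[i], y[i])

mistakes = int(np.sum(np.sign(X @ w) != y))
loss = float(np.mean(np.maximum(0.0, 1.0 - y * (X @ w)) ** 2))
print(f"training mistakes: {mistakes}, mean squared-hinge loss: {loss:.2e}")

Under interpolation the training loss can be driven toward zero with this fixed step size; no decaying schedule is needed.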
