Journal: JMLR: Workshop and Conference Proceedings

Convergence of Gradient Descent on Separable Data



Abstract

We provide a detailed study of the implicit bias of gradient descent when optimizing loss functions with strictly monotone tails, such as the logistic loss, over separable datasets. We address two basic questions: (a) under what conditions on the tail of the loss function does gradient descent converge in the direction of the $L_2$ maximum-margin separator? (b) how does the rate of margin convergence depend on the tail of the loss function and the choice of step size? We show that for a large family of super-polynomial tailed losses, gradient descent iterates on linear networks of any depth converge in the direction of the $L_2$ maximum-margin solution, while this does not hold for losses with heavier tails. Within this family, for simple linear models we show that the optimal rate with a fixed step size is indeed obtained for the commonly used exponentially tailed losses such as the logistic loss. However, with a fixed step size the optimal convergence rate is extremely slow, $1/\log(t)$, as also proved in Soudry et al. (2018). For linear models with exponential loss, we further prove that the convergence rate can be improved to $\log(t)/\sqrt{t}$ by using aggressive step sizes that compensate for the rapidly vanishing gradients. Numerical results suggest this method might be useful for deep networks.
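
The contrast between a fixed step size and an aggressive one can be illustrated on a tiny linear problem with exponential loss. The sketch below is not the paper's code. The two data points, the learning rate of 0.1, and the choice of normalized gradient descent with step $1/\sqrt{t+1}$ as the "aggressive" schedule are all assumptions made for illustration. The data are built so that the $L_2$ maximum-margin direction is $(1, 0)$ with normalized margin 1, which lets the margin gap be measured directly.

```python
import numpy as np

# Hypothetical toy data: each row is z_i = y_i * x_i (label folded into the features).
# Both rows are support vectors; the L2 max-margin direction is (1, 0)
# and the maximum normalized margin is exactly 1.
Z = np.array([[1.0, 1.0], [1.0, -3.0]])
MAX_MARGIN = 1.0


def normalized_margin(w):
    """Return min_i <w, z_i> / ||w||."""
    return float(np.min(Z @ w) / np.linalg.norm(w))


def fixed_step_gd(steps, lr=0.1):
    """Plain gradient descent on the mean exponential loss, constant step size."""
    w = np.zeros(2)
    for _ in range(steps):
        grad = -(np.exp(-(Z @ w)) @ Z) / len(Z)
        w = w - lr * grad
    return w


def normalized_gd(steps):
    """Assumed 'aggressive' variant: unit-norm descent direction, step 1/sqrt(t+1)."""
    w = np.zeros(2)
    for t in range(steps):
        m = Z @ w
        # Per-sample weights rescaled by exp(m.min()); the normalized direction
        # is unchanged by this rescaling, and it avoids floating-point underflow.
        p = np.exp(-(m - m.min()))
        d = p @ Z  # proportional to the negative gradient
        w = w + d / np.linalg.norm(d) / np.sqrt(t + 1)
    return w


T = 10_000
gap_fixed = MAX_MARGIN - normalized_margin(fixed_step_gd(T))
gap_norm = MAX_MARGIN - normalized_margin(normalized_gd(T))
print(f"margin gap, fixed step:      {gap_fixed:.5f}")
print(f"margin gap, normalized step: {gap_norm:.5f}")
```

On this toy problem the normalized variant should finish with a noticeably smaller margin gap than the fixed-step run. That matches the qualitative claim in the abstract, though a two-point example says nothing about the exact rates.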
