JMLR: Workshop and Conference Proceedings

On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization

Abstract

Conventional wisdom in deep learning states that increasing depth improves expressiveness but complicates optimization. This paper suggests that, sometimes, increasing depth can speed up optimization. The effect of depth on optimization is decoupled from expressiveness by focusing on settings where additional layers amount to overparameterization – linear neural networks, a well-studied model. Theoretical analysis, as well as experiments, show that here depth acts as a preconditioner which may accelerate convergence. Even on simple convex problems such as linear regression with $\ell_p$ loss, $p>2$, gradient descent can benefit from transitioning to a non-convex overparameterized objective, more than it would from some common acceleration schemes. We also prove that it is mathematically impossible to obtain the acceleration effect of overparametrization via gradients of any regularizer.
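
The following sketch is not code from the paper; it merely illustrates the setup the abstract describes. It runs gradient descent on a linear-regression objective with $\ell_4$ loss twice: once on a directly parameterized weight vector, and once with the same end-to-end weights overparameterized as the product of two layers. The data, initialization, and step size are assumptions chosen purely for illustration.

# Minimal sketch (not from the paper): gradient descent on linear regression
# with l_4 loss, once with a direct weight vector w, and once with the same
# end-to-end weights overparameterized as w2 * w1 (a depth-2 linear network).
# Data, initialization, and step size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

def l4_grad(w):
    # Gradient of (1/n) * sum((X @ w - y)**4) with respect to w.
    r = X @ w - y
    return (4.0 / n) * (X.T @ r**3)

lr, steps = 1e-3, 5000

# Baseline: gradient descent directly on w.
w = np.zeros(d)
for _ in range(steps):
    w -= lr * l4_grad(w)

# Overparameterized: end-to-end weights are w2 * w1, with a scalar second
# "layer". Updates follow from the chain rule on the same l_4 objective;
# the near-zero initialization is only a rough stand-in for the regime
# analyzed in the paper.
w1, w2 = np.zeros(d), 1.0
for _ in range(steps):
    g = l4_grad(w2 * w1)                  # gradient w.r.t. end-to-end weights
    w1, w2 = w1 - lr * w2 * g, w2 - lr * (g @ w1)

print("direct    l4 loss:", np.mean((X @ w - y) ** 4))
print("overparam l4 loss:", np.mean((X @ (w2 * w1) - y) ** 4))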
