JMLR: Workshop and Conference Proceedings

On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization

Abstract

Conventional wisdom in deep learning states that increasing depth improves expressiveness but complicates optimization. This paper suggests that, sometimes, increasing depth can speed up optimization. The effect of depth on optimization is decoupled from expressiveness by focusing on settings where additional layers amount to overparameterization – linear neural networks, a well-studied model. Theoretical analysis, as well as experiments, show that here depth acts as a preconditioner which may accelerate convergence. Even on simple convex problems such as linear regression with $\ell_p$ loss, $p>2$, gradient descent can benefit from transitioning to a non-convex overparameterized objective, more than it would from some common acceleration schemes. We also prove that it is mathematically impossible to obtain the acceleration effect of overparametrization via gradients of any regularizer.
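
The following sketch is not code from the paper; it merely illustrates the setup the abstract describes. It runs gradient descent on a linear-regression objective with $\ell_4$ loss twice: once on a directly parameterized weight vector, and once with the same end-to-end weights overparameterized as the product of two layers. The data, initialization, and step size are assumptions chosen purely for illustration.

# Minimal sketch (not from the paper): gradient descent on linear regression
# with l_4 loss, once with a direct weight vector w, and once with the same
# end-to-end weights overparameterized as w2 * w1 (a depth-2 linear network).
# Data, initialization, and step size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

def l4_grad(w):
    # Gradient of (1/n) * sum((X @ w - y)**4) with respect to w.
    r = X @ w - y
    return (4.0 / n) * (X.T @ r**3)

lr, steps = 1e-3, 5000

# Baseline: gradient descent directly on w.
w = np.zeros(d)
for _ in range(steps):
    w -= lr * l4_grad(w)

# Overparameterized: end-to-end weights are w2 * w1, with a scalar second
# "layer". Updates follow from the chain rule on the same l_4 objective;
# the near-zero initialization is only a rough stand-in for the regime
# analyzed in the paper.
w1, w2 = np.zeros(d), 1.0
for _ in range(steps):
    g = l4_grad(w2 * w1)                  # gradient w.r.t. end-to-end weights
    w1, w2 = w1 - lr * w2 * g, w2 - lr * (g @ w1)

print("direct    l4 loss:", np.mean((X @ w - y) ** 4))
print("overparam l4 loss:", np.mean((X @ (w2 * w1) - y) ** 4))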
