Non-convex optimization is ubiquitous in modern machine learning applications. Gradient descent based algorithms (such as gradient descent or Nesterov's accelerated gradient descent) are widely used in practice since they have been observed to perform well on these problems. From a theoretical standpoint, however, this is quite surprising since these non-convex problems are teeming with highly suboptimal saddle points, and it is well known that gradient descent (and variants) can get stuck at such saddle points. A key question in resolving this apparent paradox is whether gradient descent based algorithms can efficiently escape saddle points where the Hessian has negative eigenvalues and converge to points where the Hessian is positive semidefinite (such points are called second-order stationary points). We answer this question in the affirmative by showing the following: (1) Perturbed gradient descent converges to second-order stationary points in a number of iterations that depends only poly-logarithmically on dimension (i.e., it is almost "dimension-free"). The convergence rate of this procedure matches the well-known convergence rate of gradient descent to first-order stationary points, up to log factors. (2) A variant of Nesterov's accelerated gradient descent converges to second-order stationary points at a faster rate than perturbed gradient descent. The key technical components behind these results are (1) understanding the geometry around saddle points and (2) designing a potential function for Nesterov's accelerated gradient descent for non-convex problems.
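The abstract does not spell out the procedure, but the idea behind perturbed gradient descent can be sketched as follows: take ordinary gradient steps, and whenever the gradient is small (a potential saddle point), inject a small random perturbation so the iterate can fall along a direction of negative curvature. All parameter names and values below are illustrative assumptions, not the paper's settings, and the full algorithm includes additional checks (e.g., terminating when a perturbation fails to produce sufficient decrease) that are omitted here.

```python
import numpy as np

def perturbed_gradient_descent(grad, x0, eta=0.1, g_thresh=1e-3,
                               radius=1e-2, t_noise=10, max_iter=1000,
                               seed=0):
    """Sketch of perturbed gradient descent (illustrative parameters).

    Takes plain gradient steps; when the gradient norm falls below
    g_thresh (a potential saddle), adds a small random perturbation,
    waiting at least t_noise iterations between perturbations.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    last_perturb = -t_noise  # iteration of the most recent perturbation
    for t in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < g_thresh and t - last_perturb > t_noise:
            # Near a first-order stationary point: perturb to escape
            # any direction of negative curvature of the Hessian.
            x = x + radius * rng.standard_normal(x.shape)
            last_perturb = t
        else:
            x = x - eta * g  # ordinary gradient descent step
    return x
```

As a usage example, consider f(x, y) = x² − y² + y⁴/4, which has a strict saddle at the origin and minima at (0, ±√2). Started exactly at the saddle, plain gradient descent stays put forever, while the perturbation lets the iterate escape along the negative-curvature y direction and converge to a minimum.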