Conference on Neural Information Processing Systems

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model



Abstract

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice when optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics. We show that there is a well-defined region of parameters where the gradient-flow algorithm finds a good global minimum despite the presence of exponentially many spurious local minima. We show that this is achieved by surfing on saddles that have a strong negative direction towards the global minima, a phenomenon connected to a BBP-type threshold in the Hessian describing the critical points of the landscape.
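The spiked matrix-tensor landscape and the gradient-flow dynamics described in the abstract can be simulated directly. Below is a minimal numerical sketch (not the authors' code), assuming the standard setup with a planted spike on the sphere observed through a matrix (order-2) and an order-3 tensor channel; the noise-variance convention, the parameter values (N, delta2, delta3), the step size and the number of steps are illustrative choices, and the continuous flow is approximated by a small-step Euler discretisation with projection back onto the sphere.

```python
import numpy as np

# Illustrative sketch: gradient flow on a spiked matrix-tensor landscape.
# All parameter values below are assumptions for demonstration only.

rng = np.random.default_rng(0)

N = 50            # dimension (illustrative)
delta2 = 0.5      # matrix-channel noise variance (illustrative)
delta3 = 1.0      # tensor-channel noise variance (illustrative)
dt = 2e-3         # Euler step approximating the continuous flow
steps = 10000

# Planted signal on the sphere of radius sqrt(N)
x_star = rng.standard_normal(N)
x_star *= np.sqrt(N) / np.linalg.norm(x_star)

# Spiked matrix observation: rank-one spike plus symmetric Gaussian noise
noise2 = rng.standard_normal((N, N)) * np.sqrt(delta2)
Y = np.outer(x_star, x_star) / np.sqrt(N) + (noise2 + noise2.T) / np.sqrt(2)

# Spiked order-3 tensor observation, with noise symmetrised over permutations
noise3 = rng.standard_normal((N, N, N)) * np.sqrt(delta3)
noise3 = (noise3 + noise3.transpose(0, 2, 1) + noise3.transpose(1, 0, 2)
          + noise3.transpose(1, 2, 0) + noise3.transpose(2, 0, 1)
          + noise3.transpose(2, 1, 0)) / 6.0
T = np.einsum('i,j,k->ijk', x_star, x_star, x_star) / N + noise3

def grad(x):
    """Gradient of the squared-error landscape
    H(x) = sum_ij (Y_ij - x_i x_j / sqrt(N))^2 / (2 delta2)
         + sum_ijk (T_ijk - x_i x_j x_k / N)^2 / (2 delta3)."""
    R2 = Y - np.outer(x, x) / np.sqrt(N)
    R3 = T - np.einsum('i,j,k->ijk', x, x, x) / N
    g2 = -2.0 / (delta2 * np.sqrt(N)) * (R2 @ x)
    g3 = -3.0 / (delta3 * N) * np.einsum('ijk,j,k->i', R3, x, x)
    return g2 + g3

# Random initialisation on the sphere (essentially uncorrelated with x_star)
x = rng.standard_normal(N)
x *= np.sqrt(N) / np.linalg.norm(x)

for t in range(steps):
    x = x - dt * grad(x)
    x *= np.sqrt(N) / np.linalg.norm(x)   # project back onto the sphere
    if t % 2000 == 0:
        overlap = abs(x @ x_star) / N     # overlap with the planted spike
        print(f"step {t:6d}  overlap = {overlap:.3f}")
```

Under these assumptions, the overlap |x · x*| / N printed along the trajectory indicates whether the discretised flow escapes the spurious minima and reaches an informative minimum; in the paper this behaviour is characterised analytically through the closed-form gradient-flow equations from statistical physics rather than by direct simulation.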
