AAAI Conference on Artificial Intelligence

The Goldilocks Zone: Towards Better Understanding of Neural Network Loss Landscapes



Abstract

We explore the loss landscape of fully-connected and convolutional neural networks using random, low-dimensional hyperplanes and hyperspheres. Evaluating the Hessian, H, of the loss function on these hypersurfaces, we observe 1) an unusual excess of the number of positive eigenvalues of H, and 2) a large value of Tr(H)/||H|| at a well-defined range of configuration space radii, corresponding to a thick, hollow, spherical shell we refer to as the Goldilocks zone. We observe this effect for fully-connected neural networks over a range of network widths and depths on MNIST and CIFAR-10 datasets with the ReLU and tanh non-linearities, and a similar effect for convolutional networks. Using our observations, we demonstrate a close connection between the Goldilocks zone, measures of local convexity/prevalence of positive curvature, and the suitability of a network initialization. We show that the high and stable accuracy reached when optimizing on random, low-dimensional hypersurfaces is directly related to the overlap between the hypersurface and the Goldilocks zone, and as a corollary demonstrate that the notion of intrinsic dimension is initialization-dependent. We note that common initialization techniques initialize neural networks in this particular region of unusually high convexity/prevalence of positive curvature, and offer a geometric intuition for their success. Furthermore, we demonstrate that initializing a neural network at a number of points and selecting for high measures of local convexity such as Tr(H)/||H||, number of positive eigenvalues of H, or low initial loss, leads to statistically significantly faster training on MNIST. Based on our observations, we hypothesize that the Goldilocks zone contains an unusually high density of suitable initialization configurations.
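The measurement described in the abstract can be sketched numerically: restrict the loss to a random d-dimensional hyperplane through a point at a chosen configuration-space radius, estimate the Hessian of the restricted loss by finite differences, and report the fraction of positive eigenvalues and Tr(H)/||H||. The tiny tanh network, synthetic data, and finite-difference step below are illustrative assumptions, not the paper's actual setup (which uses MNIST/CIFAR-10 and exact second derivatives):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic binary-classification problem (stand-in for MNIST).
X = rng.standard_normal((64, 8))
y = rng.integers(0, 2, 64)

# Two-layer tanh network with a scalar logit; parameters as one flat vector.
n_hid = 6
n_params = 8 * n_hid + n_hid  # W1 is 8 x n_hid, w2 is n_hid (biases omitted)

def loss(theta):
    W1 = theta[:8 * n_hid].reshape(8, n_hid)
    w2 = theta[8 * n_hid:]
    logits = np.tanh(X @ W1) @ w2
    # Logistic loss with labels mapped to {-1, +1}.
    return np.mean(np.log1p(np.exp(-(2 * y - 1) * logits)))

# Random point at a fixed configuration-space radius.
radius = 1.0
theta0 = rng.standard_normal(n_params)
theta0 *= radius / np.linalg.norm(theta0)

# Random d-dimensional hyperplane through theta0: orthonormal basis V.
d = 4
V, _ = np.linalg.qr(rng.standard_normal((n_params, d)))

def restricted_hessian(eps=1e-4):
    """Finite-difference Hessian of z -> loss(theta0 + V @ z) at z = 0."""
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            pp = loss(theta0 + eps * (V[:, i] + V[:, j]))
            pm = loss(theta0 + eps * (V[:, i] - V[:, j]))
            mp = loss(theta0 + eps * (-V[:, i] + V[:, j]))
            mm = loss(theta0 + eps * (-V[:, i] - V[:, j]))
            H[i, j] = (pp - pm - mp + mm) / (4 * eps**2)
    return 0.5 * (H + H.T)  # symmetrize away finite-difference noise

H = restricted_hessian()
eigs = np.linalg.eigvalsh(H)
frac_positive = np.mean(eigs > 0)          # excess of positive eigenvalues
convexity = np.trace(H) / np.linalg.norm(H)  # Tr(H) / ||H||_F
print(f"positive-eigenvalue fraction: {frac_positive:.2f}")
print(f"Tr(H)/||H||: {convexity:.3f}")
```

Sweeping `radius` over several values and averaging these two statistics across random points and hyperplanes is one way to trace out the shell-like region the abstract calls the Goldilocks zone.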

