International Joint Conference on Neural Networks

Towards Quantifying Intrinsic Generalization of Deep ReLU Networks

Abstract

Understanding the underlying mechanisms that enable the empirical successes of deep neural networks is essential for further improving their performance and for explaining such networks. Towards this goal, a specific question is how to explain the "surprising" behavior of the same over-parametrized deep neural networks that can generalize well on real datasets and at the same time "memorize" training samples when the labels are randomized. In this paper, we demonstrate that deep ReLU networks generalize from training samples to new points via piecewise linear interpolation. We provide a quantitative analysis of the generalization ability of a deep ReLU network: given a fixed point x and a fixed direction in the input space $\mathcal{S}$, there is always a segment such that any point on the segment will be classified the same as the fixed point x. We call this segment the generalization interval. We show that the generalization intervals of a ReLU network behave similarly along pairwise directions between samples of the same label in both the real and random cases on the MNIST and CIFAR-10 datasets. This result suggests that the same interpolation mechanism is used in both cases. Additionally, for datasets with real labels, such networks provide a good approximation of the underlying manifold of the data, where the changes are much smaller along tangent directions than along normal directions. Our systematic experiments demonstrate for the first time that such deep neural networks generalize through the same interpolation mechanism, and they explain the differences between the networks' performance on datasets with real and random labels.
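
To make the notion concrete, below is a minimal sketch (not the authors' code) of how a generalization interval could be estimated empirically: starting from a point x, step along a fixed direction d and record how far the predicted class stays the same as at x. The model architecture, step size, and search range here are illustrative assumptions.

```python
# Minimal sketch of estimating a "generalization interval" by grid
# search along a direction; all hyperparameters are assumptions.
import torch
import torch.nn as nn

def predicted_class(model, x):
    """Class index predicted for a single input x (1-D tensor)."""
    with torch.no_grad():
        return model(x.unsqueeze(0)).argmax(dim=1).item()

def generalization_interval(model, x, d, t_max=10.0, steps=1000):
    """Estimate [t_lo, t_hi] such that every point x + t*d with
    t in [t_lo, t_hi] is classified the same as x."""
    d = d / d.norm()                      # unit direction
    base = predicted_class(model, x)
    ts = torch.linspace(0.0, t_max, steps)
    t_hi = 0.0
    for t in ts:                          # walk forward along +d
        if predicted_class(model, x + t * d) != base:
            break
        t_hi = t.item()
    t_lo = 0.0
    for t in ts:                          # walk backward along -d
        if predicted_class(model, x - t * d) != base:
            break
        t_lo = -t.item()
    return t_lo, t_hi

# Toy usage with a small ReLU network on MNIST-sized inputs; in the
# paper the directions of interest are the pairwise directions between
# samples of the same label, e.g. d = x2 - x1.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x1, x2 = torch.randn(784), torch.randn(784)
print(generalization_interval(model, x1, x2 - x1))
```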