JMLR: Workshop and Conference Proceedings

Kernel and Rich Regimes in Overparametrized Models

Abstract

A recent line of work studies overparametrized neural networks in the “kernel regime,” i.e., when during training the network behaves as a kernelized linear predictor, and thus training with gradient descent has the effect of finding the corresponding minimum RKHS norm solution. This stands in contrast to other studies which demonstrate how gradient descent on overparametrized networks can induce rich implicit biases that are not RKHS norms. Building on an observation of Chizat and Bach (2018), we show how the scale of the initialization controls the transition between the “kernel” (aka lazy) and “rich” (aka active) regimes and affects generalization properties in multilayer homogeneous models. We provide a complete and detailed analysis for a family of simple depth-$D$ linear networks that exhibit an interesting and meaningful transition between the kernel and rich regimes, and we highlight an interesting role for the width of the models. We further demonstrate this transition empirically for matrix factorization and for multilayer non-linear networks.
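
As a concrete illustration of the transition described above, consider the depth-2 case of the diagonal linear networks the paper analyzes, where the effective linear predictor is beta = u^2 - v^2. The numpy sketch below is our own toy demo, not the authors' code; the problem sizes, the two initialization scales, and the learning rate are assumptions chosen so both regimes are visible. With a large initialization scale alpha, gradient descent stays close to the kernel regime and returns (approximately) the minimum l2-norm interpolant; with a small alpha, it enters the rich regime and returns a sparse interpolant with small l1 norm.

    # Toy demo (assumed setup, not the paper's code): depth-2 diagonal linear net.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 5, 20                                   # underdetermined: many interpolants
    X = rng.standard_normal((n, d))
    beta_star = np.zeros(d)
    beta_star[:2] = 1.0                            # sparse ground truth
    y = X @ beta_star

    def train_diag_net(alpha, lr=1e-3, steps=200_000):
        # Predictor beta = u**2 - v**2; gradient descent on the squared loss,
        # initialized at u = v = alpha (so beta starts at exactly zero).
        u = alpha * np.ones(d)
        v = alpha * np.ones(d)
        for _ in range(steps):
            g = X.T @ (X @ (u**2 - v**2) - y) / n  # gradient of the loss w.r.t. beta
            u -= lr * 2 * u * g                    # chain rule through u**2
            v += lr * 2 * v * g                    # chain rule through -v**2
        return u**2 - v**2

    # Minimum l2-norm interpolant, the kernel-regime reference solution.
    beta_l2 = X.T @ np.linalg.solve(X @ X.T, y)
    print(f"min-l2:      l1={np.abs(beta_l2).sum():.3f}  l2={np.linalg.norm(beta_l2):.3f}")
    for alpha in (2.0, 0.01):
        b = train_diag_net(alpha)
        print(f"alpha={alpha:<5}  l1={np.abs(b).sum():.3f}  l2={np.linalg.norm(b):.3f}"
              f"  residual={np.linalg.norm(X @ b - y):.1e}")

In this sketch the only knob is alpha: the run with alpha = 2.0 should track the min-l2 reference printed above, while the run with alpha = 0.01 should concentrate nearly all of its l1 mass on the two true coordinates, mirroring the kernel-to-rich transition the paper establishes for the depth-2 case.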
