International Conference on Machine Learning

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime




Deep neural networks can achieve remarkable generalization performance while interpolating the training data; rather than the U-curve emblematic of the bias-variance trade-off, their test error often follows a "double descent" curve, a mark of the beneficial role of overparametrization. In this work, we develop a quantitative theory for this phenomenon in the context of high-dimensional random features regression. We obtain a precise asymptotic expression for the bias-variance decomposition of the test error, and show that the bias displays a phase transition at the interpolation threshold, beyond which it remains constant. We disentangle the variances stemming from the sampling of the dataset, from the additive noise corrupting the labels, and from the initialization of the weights. Following up on (Geiger et al., 2019a), we demonstrate that the latter two contributions are the crux of the double descent: they lead to the overfitting peak at the interpolation threshold and to the decay of the test error upon overparametrization. We quantify how they are suppressed by averaging the outputs of independently initialized estimators, and compare this ensembling procedure with overparametrization and regularization. Finally, we present numerical experiments on a standard deep learning setup to show that our results are relevant to the lazy regime of deep neural networks.
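The ensembling procedure mentioned above can be illustrated with a small numerical sketch. It is not the paper's experimental setup: the Gaussian inputs, linear teacher, ReLU feature map, ridge penalty, and all sizes are assumptions chosen for illustration, and the function names are hypothetical. With the dataset and label noise held fixed, it fits several random features ridge regressors that differ only in their random first-layer weights, then compares the average test error of a single estimator with the test error of the averaged prediction. The gap between the two is exactly the empirical variance of the predictions across initializations.

```python
import numpy as np


def random_features(X, W):
    """ReLU random features relu(X W^T / sqrt(d)); the feature map is an assumption."""
    d = X.shape[1]
    return np.maximum(X @ W.T / np.sqrt(d), 0.0)


def fit_predict(X_train, y_train, X_test, W, ridge):
    """Ridge regression on fixed random features W; only the second layer is trained."""
    Z_train = random_features(X_train, W)
    Z_test = random_features(X_test, W)
    n_features = Z_train.shape[1]
    # Closed-form ridge solution a = (Z^T Z + ridge * I)^{-1} Z^T y
    gram = Z_train.T @ Z_train + ridge * np.eye(n_features)
    coef = np.linalg.solve(gram, Z_train.T @ y_train)
    return Z_test @ coef


def ensemble_experiment(n_train=100, n_test=500, d=20, n_features=100,
                        n_init=10, noise_std=0.5, ridge=1e-3, seed=0):
    """Return (mean single-estimator error, ensemble error, initialization variance)."""
    rng = np.random.default_rng(seed)
    # Hypothetical linear teacher with Gaussian inputs and noisy training labels
    beta = rng.standard_normal(d) / np.sqrt(d)
    X_train = rng.standard_normal((n_train, d))
    X_test = rng.standard_normal((n_test, d))
    y_train = X_train @ beta + noise_std * rng.standard_normal(n_train)
    y_test = X_test @ beta  # noiseless test targets

    # Same dataset and noise for every estimator; only the random weights W change
    preds = np.stack([
        fit_predict(X_train, y_train, X_test,
                    rng.standard_normal((n_features, d)), ridge)
        for _ in range(n_init)
    ])

    single_error = np.mean((preds - y_test) ** 2, axis=1).mean()
    ensemble_error = np.mean((preds.mean(axis=0) - y_test) ** 2)
    # Variance of the prediction across initializations, averaged over test points
    init_variance = np.mean(preds.var(axis=0))
    return single_error, ensemble_error, init_variance


if __name__ == "__main__":
    single, ens, var = ensemble_experiment()
    print(f"mean single-estimator test error: {single:.4f}")
    print(f"ensemble test error:              {ens:.4f}")
    print(f"initialization variance:          {var:.4f}")
```

The defaults put the number of features equal to the number of training points, which is where the abstract locates the overfitting peak. Varying `n_features` around `n_train` in this sketch gives a rough view of how the initialization variance changes with parametrization. A finite ensemble of this kind only averages out the initialization term; the noise and dataset-sampling contributions would require resampling the labels and the training set as well.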




