Why Does Large Batch Training Result in Poor Generalization? A Comprehensive Explanation and a Better Strategy from the Viewpoint of Stochastic Optimization

Abstract

We present a comprehensive framework of search methods, such as simulated annealing and batch training, for solving nonconvex optimization problems. These methods search a wider range by gradually decreasing the randomness added to the standard gradient descent method. The formulation that we define on the basis of this framework can be directly applied to neural network training. This produces an effective approach that gradually increases batch size during training. We also explain why large batch training degrades generalization performance, which previous studies have not clarified.
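A minimal, self-contained sketch of the kind of strategy the abstract describes: gradually increasing the batch size during training so that gradient noise shrinks over time, much as the temperature is lowered in simulated annealing. The toy least-squares objective, the geometric batch-size schedule, and all hyperparameters below are illustrative assumptions, not the authors' implementation.

# Sketch (not the authors' code): increasing-batch-size training on a toy problem.
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: minimize mean((X w - y)^2) over w.
n_samples, n_features = 10_000, 20
X = rng.normal(size=(n_samples, n_features))
w_true = rng.normal(size=n_features)
y = X @ w_true + 0.1 * rng.normal(size=n_samples)

w = np.zeros(n_features)
lr = 0.01
epochs = 30
initial_batch, final_batch = 16, 1024  # assumed schedule endpoints

for epoch in range(epochs):
    # Grow the batch size geometrically from initial_batch to final_batch,
    # so early epochs are noisy (wide search) and late epochs are nearly full-batch.
    t = epoch / max(epochs - 1, 1)
    batch_size = int(initial_batch * (final_batch / initial_batch) ** t)

    perm = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        idx = perm[start:start + batch_size]
        # Mini-batch gradient of the mean-squared error.
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
        w -= lr * grad

    loss = np.mean((X @ w - y) ** 2)
    print(f"epoch {epoch:02d}  batch_size {batch_size:5d}  loss {loss:.4f}")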

Bibliographic Information

  • Source
    Neural Computation, 2018, Issue 7, pp. 2005-2023 (19 pages)
  • Author Affiliations

    Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Hokkaido 060-0814, Japan;

    Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Hokkaido 060-0814, Japan;

    Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Hokkaido 060-0814, Japan;

  • Indexed in: Science Citation Index (SCI); Chemical Abstracts (CA)
  • Original Format: PDF
  • Language: English
  • Chinese Library Classification:
  • Keywords:
