International Joint Conference on Neural Networks

Hessian-based Bounds on Learning Rate for Gradient Descent Algorithms



Abstract

Learning rate is a crucial parameter governing the convergence rate of any learning algorithm. Most learning algorithms based on the stochastic gradient descent (SGD) method depend on a heuristic choice of the learning rate. In this paper, we derive bounds on the learning rate of SGD-based adaptive learning algorithms by analyzing the largest eigenvalue of the Hessian matrix from first principles. The proposed approach is analytical. To illustrate its efficacy, we consider several high-dimensional data sets and compare the rate of convergence of the error for the neural gas algorithm, showing that the proposed bounds on the learning rate yield faster convergence than the AdaDec, Adam, and AdaDelta approaches, which require hyper-parameter tuning.
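The abstract does not reproduce the paper's derivation, but the underlying idea can be connected to the classical result that, for a quadratic objective, plain gradient descent is stable only when the step size stays below 2/λ_max(H), where λ_max(H) is the largest eigenvalue of the Hessian. The sketch below illustrates that textbook bound only, not the authors' SGD-based algorithm or their neural gas experiments: it estimates λ_max by power iteration on Hessian-vector products and then runs gradient descent with a learning rate chosen under the bound. All function names and the toy quadratic loss are invented for the example.

```python
import numpy as np

def largest_hessian_eigenvalue(hvp, dim, iters=100, seed=0):
    """Estimate lambda_max(H) by power iteration on Hessian-vector products."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(iters):
        hv = hvp(v)
        lam = float(v @ hv)            # Rayleigh quotient v^T H v with ||v|| = 1
        v = hv / np.linalg.norm(hv)
    return lam

# Toy quadratic loss L(w) = 0.5 * (w - w*)^T H (w - w*) with a fixed Hessian H.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
H = A.T @ A / 50.0 + 0.5 * np.eye(50)  # symmetric positive definite Hessian
w_star = rng.standard_normal(50)       # minimiser of the quadratic loss

hvp = lambda v: H @ v                  # Hessian-vector product for this loss
lam_max = largest_hessian_eigenvalue(hvp, dim=50)

eta = 1.0 / lam_max                    # any fixed eta < 2 / lam_max is stable here
w = np.zeros(50)
for _ in range(200):
    grad = H @ (w - w_star)            # gradient of the quadratic loss
    w -= eta * grad

print("estimated lambda_max:", lam_max)
print("distance to optimum:", np.linalg.norm(w - w_star))
```

With this choice of step size the iterates contract toward w* in every eigendirection of H; choosing eta above 2 / lam_max would make the iteration diverge, which is the kind of stability constraint a Hessian-based bound on the learning rate is meant to capture.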

