International Joint Conference on Neural Networks

Learning the Step-size Policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno Algorithm



Abstract

We consider the problem of learning a step-size policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm, a limited-memory quasi-Newton method widely used for deterministic unconstrained optimization. L-BFGS is, however, often avoided in large-scale problems because it requires a step size to be supplied at each iteration. Current approaches to step-size selection for L-BFGS rely on heuristic tuning of design parameters and on extensive re-evaluations of the objective function and its gradient to find appropriate step lengths. We propose a neural network architecture that takes local information about the current iterate as input. The step-size policy is learned from data of similar optimization problems, avoids additional evaluations of the objective function, and guarantees that the output step remains inside a predefined interval. The corresponding training procedure is formulated as a stochastic optimization problem using the backpropagation-through-time algorithm. The performance of the proposed method is evaluated on the training of image classifiers for the MNIST database of handwritten digits and for CIFAR-10. The results show that the proposed algorithm outperforms heuristically tuned optimizers such as ADAM, RMSprop, L-BFGS with a backtracking line search, and L-BFGS with a constant step size. The numerical results also show that a learned policy can be used as a warm start for training new policies on different problems after only a few additional training steps, highlighting its potential use across multiple large-scale optimization problems.
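To make the setting concrete, the sketch below shows a standard L-BFGS loop (two-loop recursion) in which the step size at each iteration comes from a pluggable policy instead of a backtracking line search, and only gradient evaluations are used. The `step_policy` interface, the clipping interval, and the constant placeholder policy are illustrative assumptions; the paper's policy is a learned neural network whose architecture and training loop are not reproduced here.

```python
import numpy as np
from collections import deque


def two_loop_recursion(grad, s_hist, y_hist):
    """Standard L-BFGS two-loop recursion: returns the search direction -H_k @ grad."""
    q = grad.copy()
    cache = []
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / y.dot(s)
        alpha = rho * s.dot(q)
        q -= alpha * y
        cache.append((alpha, rho, s, y))
    if s_hist:  # initial Hessian scaling gamma_k = s^T y / y^T y (a common choice)
        s, y = s_hist[-1], y_hist[-1]
        q *= s.dot(y) / y.dot(y)
    for alpha, rho, s, y in reversed(cache):
        beta = rho * y.dot(q)
        q += (alpha - beta) * s
    return -q


def lbfgs_with_policy(x0, grad_fn, step_policy, memory=10, iters=100):
    """L-BFGS loop whose step size comes from a policy instead of a line search.

    Only gradient evaluations are performed, consistent with a policy that
    avoids extra objective-function evaluations.
    """
    x = x0.copy()
    g = grad_fn(x)
    s_hist, y_hist = deque(maxlen=memory), deque(maxlen=memory)
    for _ in range(iters):
        d = two_loop_recursion(g, list(s_hist), list(y_hist))
        # Hypothetical interface: the policy sees local information about the
        # current iterate (here the gradient and the search direction) and
        # returns a step length inside a fixed interval.
        t = step_policy(g, d)
        x_next = x + t * d
        g_next = grad_fn(x_next)
        s, y = x_next - x, g_next - g
        if y.dot(s) > 1e-10:  # keep the pair only if the curvature condition holds
            s_hist.append(s)
            y_hist.append(y)
        x, g = x_next, g_next
    return x


def constant_policy(g, d, t_min=1e-4, t_max=1.0):
    """Placeholder policy: a constant step clipped to [t_min, t_max].

    In the paper this is a learned neural network; the constant is illustrative.
    """
    return float(np.clip(0.5, t_min, t_max))


# Usage on a toy quadratic f(x) = 0.5 * ||A x - b||^2, supplying only its gradient.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = lbfgs_with_policy(np.zeros(2), lambda x: A.T @ (A @ x - b), constant_policy)
```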
