Learning the Step-size Policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno Algorithm

Abstract

We consider the problem of learning a step-size policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm, a limited-memory quasi-Newton method widely used for deterministic unconstrained optimization. L-BFGS is currently avoided in large-scale problems because it requires a step size to be provided at each iteration. Current methods for step-size selection in L-BFGS rely on heuristic tuning of design parameters and on many re-evaluations of the objective function and gradient to find appropriate step lengths. We propose a neural network architecture that takes local information about the current iterate as its input. The step-length policy is learned from data of similar optimization problems, avoids additional evaluations of the objective function, and guarantees that the output step remains inside a pre-defined interval. The corresponding training procedure is formulated as a stochastic optimization problem using the backpropagation through time algorithm. The performance of the proposed method is evaluated on the training of image classifiers for the MNIST database of handwritten digits and for CIFAR-10. The results show that the proposed algorithm outperforms heuristically tuned optimizers such as ADAM, RMSprop, L-BFGS with a backtracking line search, and L-BFGS with a constant step size. The numerical results also show that a learned policy can be used as a warm start to train new policies for different problems after a few additional training steps, highlighting its potential use in multiple large-scale optimization problems.
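
To make the setting concrete, the following is a minimal sketch of an L-BFGS iteration in which the step size comes from a pluggable policy instead of a line search, and is squashed into a pre-defined interval. It is not the authors' neural network or training procedure: the names `two_loop_direction`, `bounded_step`, and `lbfgs_with_policy`, the policy signature `policy(grad, direction)`, the sigmoid squashing, and the default interval `[0.1, 0.9]` are all assumptions chosen for illustration. Only the two-loop recursion is the standard L-BFGS direction computation.

```python
import numpy as np

def two_loop_direction(grad, s_hist, y_hist):
    """Standard L-BFGS two-loop recursion; histories are ordered oldest first."""
    q = np.array(grad, dtype=float)
    coeffs = []
    # First loop: newest curvature pair to oldest.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / np.dot(y, s)
        alpha = rho * np.dot(s, q)
        coeffs.append((alpha, rho))
        q = q - alpha * y
    # Scale by the usual initial inverse-Hessian guess (s.y / y.y).
    if s_hist:
        q = q * (np.dot(s_hist[-1], y_hist[-1]) / np.dot(y_hist[-1], y_hist[-1]))
    # Second loop: oldest pair to newest.
    for (alpha, rho), s, y in zip(reversed(coeffs), s_hist, y_hist):
        beta = rho * np.dot(y, q)
        q = q + (alpha - beta) * s
    return -q

def bounded_step(raw, t_min, t_max):
    """Map an unconstrained scalar into [t_min, t_max] (assumed sigmoid squashing)."""
    return t_min + (t_max - t_min) / (1.0 + np.exp(-raw))

def lbfgs_with_policy(grad_fn, x0, policy, t_min=0.1, t_max=0.9,
                      memory=5, max_iter=200, tol=1e-10):
    """L-BFGS where a hypothetical `policy(grad, direction)` picks the step size.

    No objective evaluations are made; only one gradient per iteration.
    """
    x = np.array(x0, dtype=float)
    g = grad_fn(x)
    s_hist, y_hist = [], []
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = two_loop_direction(g, s_hist, y_hist)
        t = bounded_step(policy(g, d), t_min, t_max)
        x_new = x + t * d
        g_new = grad_fn(x_new)
        s, y = x_new - x, g_new - g
        # Keep the pair only if the curvature condition holds with some margin.
        if np.dot(s, y) > 1e-10 * np.linalg.norm(s) * np.linalg.norm(y):
            s_hist.append(s)
            y_hist.append(y)
            if len(s_hist) > memory:
                s_hist.pop(0)
                y_hist.pop(0)
        x, g = x_new, g_new
    return x

# Usage: minimize f(x) = 0.5 * ||x||^2 with a placeholder policy that always
# returns 0.0, which the squashing maps to the midpoint of the interval.
x_final = lbfgs_with_policy(lambda x: 1.0 * x, [3.0, -4.0], lambda g, d: 0.0)
```

In this sketch a learned policy would replace the constant placeholder; the squashing step is what keeps any policy output inside the chosen interval regardless of what the policy returns.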


Bibliographic details

Source: International Joint Conference on Neural Networks, 2021, pp. 1-8 (8 pages)
Authors: Lucas N. Egidio; Anders Hansson; Bo Wahlberg
Original format: PDF
Keywords: Training; Backtracking; Machine learning algorithms; Databases; Heuristic algorithms; Neural networks; Linear programming

