Workshop on Automatic Speech Recognition and Understanding

ACCELERATING HESSIAN-FREE OPTIMIZATION FOR DEEP NEURAL NETWORKS BY IMPLICIT PRECONDITIONING AND SAMPLING

Abstract

Hessian-free training has become a popular parallel second-order optimization technique for Deep Neural Network training. This study aims to speed up Hessian-free training, both by decreasing the amount of data used for training and by reducing the number of Krylov subspace solver iterations used for implicit estimation of the Hessian. In this paper, we develop an L-BFGS-based preconditioning scheme that avoids the need to access the Hessian explicitly. Since L-BFGS cannot be regarded as a fixed-point iteration, we further propose employing flexible Krylov subspace solvers that retain the desired theoretical convergence guarantees of their conventional counterparts. Second, we propose a new sampling algorithm that geometrically increases the amount of data used for gradient and Krylov subspace iteration calculations. On a 50-hr English Broadcast News task, we find that these methodologies provide roughly a 1.5x speedup, whereas on a 300-hr Switchboard task they provide over a 2.3x speedup, with no loss in WER. These results suggest that even greater speedups can be expected as problem scale and complexity grow.
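The abstract only names the two solver-side ingredients: an L-BFGS preconditioner that never accesses the Hessian explicitly, and a flexible Krylov solver that tolerates a preconditioner changing between iterations. The sketch below is a minimal illustration of how these can fit together, not the paper's implementation: it runs a flexible preconditioned conjugate gradient on the damped Gauss-Newton system, applies the preconditioner through the standard L-BFGS two-loop recursion, and, as an assumption, harvests the curvature pairs (s, y) from CG's own steps, where y = As holds exactly for a quadratic. All names and parameters (`lbfgs_precondition`, `flexible_pcg`, `history`, `n_iters`) are hypothetical.

```python
import numpy as np

def lbfgs_precondition(v, pairs):
    """Apply an L-BFGS approximation of H^{-1} to v via the two-loop
    recursion over stored curvature pairs (s_i, y_i); the Hessian itself
    is never formed or accessed."""
    if not pairs:
        return v.copy()
    q = v.copy()
    rhos = [1.0 / (y @ s) for s, y in pairs]
    alphas = []
    for (s, y), rho in zip(reversed(pairs), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    s_k, y_k = pairs[-1]
    q *= (s_k @ y_k) / (y_k @ y_k)           # initial scaling gamma * I
    for (s, y), rho, a in zip(pairs, rhos, reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return q

def flexible_pcg(Av, b, n_iters=50, history=10, tol=1e-6):
    """Flexible preconditioned CG for A x = b, where Av(p) returns the
    damped Gauss-Newton matrix-vector product.  Because the L-BFGS
    preconditioner is rebuilt as pairs accumulate, the flexible
    (Polak-Ribiere-style) beta replaces the usual Fletcher-Reeves one."""
    pairs = []                                # curvature pairs harvested below
    x = np.zeros_like(b)
    r = b.copy()
    z = lbfgs_precondition(r, pairs)
    p = z.copy()
    for _ in range(n_iters):
        Ap = Av(p)
        alpha = (r @ z) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) <= tol * np.linalg.norm(b):
            return x
        # assumption: with s = alpha*p, y = A s holds exactly on a quadratic
        pairs.append((alpha * p, alpha * Ap))
        pairs = pairs[-history:]
        z_new = lbfgs_precondition(r_new, pairs)  # preconditioner has changed
        beta = (z_new @ (r_new - r)) / (z @ r)    # flexible beta
        p = z_new + beta * p
        r, z = r_new, z_new
    return x

# toy usage: a random SPD system standing in for (Gauss-Newton + damping) d = -g
rng = np.random.default_rng(0)
B = rng.standard_normal((200, 200))
A = B @ B.T + 0.1 * np.eye(200)
g = rng.standard_normal(200)
d = flexible_pcg(lambda p: A @ p, -g)
```

The flexible beta is what preserves the conventional convergence behavior here: a standard CG update assumes a fixed preconditioner, which an iteration-dependent L-BFGS approximation violates.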
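The sampling side is likewise described only at a high level: the amount of data used for gradient and Krylov-iteration computations grows geometrically across Hessian-free iterations. Below is a minimal sketch of such a schedule, with placeholder constants (`frac0`, `growth`) that are not taken from the paper.

```python
import numpy as np

def geometric_sample_schedule(dataset_size, n_hf_iters,
                              frac0=0.01, growth=1.5, seed=0):
    """Yield one index set per HF iteration; the sample size starts at
    frac0 * dataset_size and grows by `growth` each iteration until the
    full training set is reached.  Constants are illustrative only."""
    rng = np.random.default_rng(seed)
    size = max(1, int(frac0 * dataset_size))
    for _ in range(n_hf_iters):
        size = min(size, dataset_size)
        yield rng.choice(dataset_size, size=size, replace=False)
        size = int(np.ceil(size * growth))

# e.g.: for idx in geometric_sample_schedule(num_utterances, 20): ...
```

The rationale for schedules of this shape is that early iterations tolerate cheap, noisy gradient and curvature estimates, while later iterations, which need accurate update directions, see nearly the full training set.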
