JMLR: Workshop and Conference Proceedings

An Optimal Control Approach to Deep Learning and Applications to Discrete-Weight Neural Networks


Abstract

Deep learning is formulated as a discrete-time optimal control problem. This allows one to characterize necessary conditions for optimality and to develop training algorithms that do not rely on gradients with respect to the trainable parameters. In particular, we introduce the discrete-time method of successive approximations (MSA), which is based on Pontryagin's maximum principle, for training neural networks. A rigorous error estimate for the discrete MSA is obtained, which sheds light on its dynamics and on means to stabilize the algorithm. The developed methods are applied to train, in a rather principled way, neural networks with weights that are constrained to take values in a discrete set. We obtain competitive performance and, interestingly, very sparse weights in the case of ternary networks, which may be useful for model deployment on low-memory devices.
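To make the MSA recipe concrete, below is a minimal sketch for a toy network of purely linear layers with ternary weights. The function names, the absence of a running cost, and the linear-layer simplification (which makes the Hamiltonian separable across weight entries) are illustrative assumptions, not the paper's exact formulation. Each iteration performs a forward state pass, a backward co-state pass, and a layer-wise maximization of the Hamiltonian over the discrete weight set.

```python
import numpy as np

# Sketch of the discrete-time method of successive approximations (MSA)
# for a toy network x_{t+1} = W_t @ x_t with ternary weights and terminal
# loss 0.5 * ||x_T - target||^2.  Assumptions: no running cost, linear
# layers, so H_t(x, p, W) = p^T W x is separable in the entries of W.

TERNARY = np.array([-1.0, 0.0, 1.0])  # admissible discrete weight values

def forward(x0, weights):
    """Forward pass: propagate the state and keep all intermediate states."""
    xs = [x0]
    for W in weights:
        xs.append(W @ xs[-1])
    return xs

def costates(xs, weights, target):
    """Backward pass: p_T = -grad Phi(x_T), then p_t = W_t^T p_{t+1}."""
    ps = [None] * (len(weights) + 1)
    ps[-1] = -(xs[-1] - target)
    for t in reversed(range(len(weights))):
        ps[t] = weights[t].T @ ps[t + 1]
    return ps

def maximize_hamiltonian(x_t, p_next):
    """Maximize H = p^T W x = sum_ij W_ij * (p_i x_j) over ternary entries.
    Because H is linear in each entry, every W_ij is maximized independently."""
    score = np.outer(p_next, x_t)          # coefficient multiplying each W_ij
    choices = score[..., None] * TERNARY   # value of H-contribution per candidate
    return TERNARY[np.argmax(choices, axis=-1)]

def msa_step(x0, weights, target):
    """One MSA iteration: forward pass, co-state pass, layer-wise maximization."""
    xs = forward(x0, weights)
    ps = costates(xs, weights, target)
    return [maximize_hamiltonian(xs[t], ps[t + 1]) for t in range(len(weights))]
```

Iterating `msa_step` replaces all layer weights at once with Hamiltonian maximizers. For nonlinear layers this basic scheme can oscillate, which is where the paper's error estimate and the resulting stabilization (e.g., penalizing large changes between iterations) come in; the sketch above omits such terms for brevity.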
