Conference on Uncertainty in Artificial Intelligence

Annealed Gradient Descent for Deep Learning

Abstract

Stochastic gradient descent (SGD) has been regarded as a successful optimization algorithm in machine learning. In this paper, we propose a novel annealed gradient descent (AGD) method for non-convex optimization in deep learning. AGD optimizes a sequence of gradually improved, smoother mosaic functions that approximate the original non-convex objective function according to an annealing schedule during the optimization process. We present a theoretical analysis of its convergence properties and learning speed. The proposed AGD algorithm is applied to learning deep neural networks (DNNs) for image recognition on MNIST and speech recognition on Switchboard. Experimental results show that AGD yields performance comparable to SGD while significantly expediting DNN training on large data sets (by about 40%).
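The abstract describes AGD as following gradients of a sequence of progressively sharper smoothed surrogates of the non-convex objective, controlled by an annealing schedule. The paper's actual mosaic-function construction is not reproduced in this abstract, so the sketch below is only illustrative: it substitutes Gaussian (randomized) smoothing with a shrinking noise scale as a stand-in for the smoothed surrogates, and all function names, the schedule, and the hyper-parameters are assumptions rather than the authors' method.

```python
# Minimal sketch of annealing-style gradient descent (NOT the paper's exact
# mosaic-function construction): Gaussian smoothing with a shrinking noise
# scale stands in for the sequence of smoothed surrogate objectives.
import numpy as np

def smoothed_grad(grad_fn, w, sigma, rng, n_samples=16):
    # Monte-Carlo gradient of the Gaussian-smoothed surrogate E_eps[f(w + eps)].
    noise = rng.normal(scale=sigma, size=(n_samples,) + w.shape)
    return np.mean([grad_fn(w + eps) for eps in noise], axis=0)

def annealed_gradient_descent(grad_fn, w0, lr=0.05, sigma0=1.0, decay=0.95, steps=300):
    # Follow gradients of progressively less-smoothed surrogates; the
    # multiplicative decay of sigma plays the role of the annealing schedule.
    rng = np.random.default_rng(0)
    w, sigma = w0.astype(float), sigma0
    for _ in range(steps):
        g = smoothed_grad(grad_fn, w, sigma, rng) if sigma > 1e-3 else grad_fn(w)
        w = w - lr * g
        sigma *= decay  # anneal: the surrogate approaches the original objective
    return w

if __name__ == "__main__":
    # Toy non-convex objective with many shallow local minima.
    f = lambda w: np.sin(5.0 * w).sum() + 0.5 * (w ** 2).sum()
    grad = lambda w: 5.0 * np.cos(5.0 * w) + w
    w_star = annealed_gradient_descent(grad, np.array([3.0]))
    print("approximate minimizer:", w_star, "objective:", f(w_star))
```

In this toy setup, the large initial noise scale washes out sharp local minima early in training, and as sigma decays the updates approach plain gradient descent on the original objective, mirroring the annealed sequence of surrogates described in the abstract.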
