Annual Conference on Neural Information Processing Systems

Variance Reduction for Stochastic Gradient Optimization

Abstract

Stochastic gradient optimization is a class of widely used algorithms for training machine learning models. To optimize an objective, these algorithms use a noisy gradient computed from random data samples in place of the true gradient computed from the entire dataset. However, when the variance of the noisy gradient is large, the algorithm may spend much time bouncing around, leading to slower convergence and worse performance. In this paper, we develop a general approach to variance reduction in stochastic gradient optimization using control variates. Data statistics such as low-order moments (pre-computed or estimated online) are used to form the control variate. We demonstrate how to construct the control variate for two practical problems solved with stochastic gradient optimization: one convex, MAP estimation for logistic regression, and one non-convex, stochastic variational inference for latent Dirichlet allocation. On both problems, our approach converges faster and performs better than the classical approach.
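A classical control variate corrects a noisy gradient estimate g with a correlated quantity h whose expectation is known: g_hat = g - a * (h - E[h]) remains unbiased for any coefficient a, and choosing a close to Cov(g, h) / Var(h) minimizes the variance. The sketch below illustrates this idea on a toy logistic-regression MAP problem. It is a minimal illustration, not the authors' exact construction: the control variate here simply re-evaluates the gradient at the covariate mean, and all data and names are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Toy logistic-regression data (hypothetical, for illustration only).
n, d = 10000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

# Pre-computed low-order data statistics, as the abstract suggests.
x_bar = X.mean(axis=0)   # first moment of the covariates
y_bar = y.mean()

def noisy_grad(w, idx):
    # Mini-batch gradient of the negative log-likelihood (prior omitted).
    p = 1.0 / (1.0 + np.exp(-X[idx] @ w))
    return X[idx].T @ (p - y[idx]) / len(idx)

def cv(w, idx):
    # Control variate: the same gradient with every covariate replaced by
    # the data mean. It is correlated with noisy_grad, and its expectation
    # over a uniformly sampled mini-batch is known in closed form (cv_mean).
    p_bar = 1.0 / (1.0 + np.exp(-x_bar @ w))
    return x_bar * (p_bar - y[idx].mean())

def cv_mean(w):
    p_bar = 1.0 / (1.0 + np.exp(-x_bar @ w))
    return x_bar * (p_bar - y_bar)

w = np.zeros(d)
lr, batch = 0.5, 32
for _ in range(2000):
    idx = rng.choice(n, size=batch, replace=False)
    # Variance-reduced estimator (coefficient a fixed to 1 for simplicity):
    # still unbiased because E[cv] is known exactly.
    g = noisy_grad(w, idx) - (cv(w, idx) - cv_mean(w))
    w -= lr * g

How much the correction helps depends on how strongly the control variate correlates with the noisy gradient; fixing a = 1 as above is the simplest choice, and estimating the optimal coefficient from running covariance estimates typically reduces the variance further.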
