Annual Conference on Neural Information Processing Systems

Variance Reduction for Stochastic Gradient Optimization

Abstract

Stochastic gradient optimization is a class of widely used algorithms for training machine learning models. To optimize an objective, these algorithms use a noisy gradient computed from random data samples in place of the true gradient computed from the entire dataset. However, when the variance of the noisy gradient is large, the algorithm may spend much time bouncing around, leading to slower convergence and worse performance. In this paper, we develop a general approach to variance reduction in stochastic gradient optimization using control variates. Data statistics such as low-order moments (pre-computed or estimated online) are used to form the control variate. We demonstrate how to construct the control variate for two practical problems solved with stochastic gradient optimization: one convex, MAP estimation for logistic regression, and one non-convex, stochastic variational inference for latent Dirichlet allocation. On both problems, our approach converges faster and performs better than the classical approach.
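A classical control variate corrects a noisy gradient estimate g with a correlated quantity h whose expectation is known: g_hat = g - a * (h - E[h]) remains unbiased for any coefficient a, and choosing a close to Cov(g, h) / Var(h) minimizes the variance. The sketch below illustrates this idea on a toy logistic-regression MAP problem. It is a minimal illustration, not the authors' exact construction: the control variate here simply re-evaluates the gradient at the covariate mean, and all data and names are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Toy logistic-regression data (hypothetical, for illustration only).
n, d = 10000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

# Pre-computed low-order data statistics, as the abstract suggests.
x_bar = X.mean(axis=0)   # first moment of the covariates
y_bar = y.mean()

def noisy_grad(w, idx):
    # Mini-batch gradient of the negative log-likelihood (prior omitted).
    p = 1.0 / (1.0 + np.exp(-X[idx] @ w))
    return X[idx].T @ (p - y[idx]) / len(idx)

def cv(w, idx):
    # Control variate: the same gradient with every covariate replaced by
    # the data mean. It is correlated with noisy_grad, and its expectation
    # over a uniformly sampled mini-batch is known in closed form (cv_mean).
    p_bar = 1.0 / (1.0 + np.exp(-x_bar @ w))
    return x_bar * (p_bar - y[idx].mean())

def cv_mean(w):
    p_bar = 1.0 / (1.0 + np.exp(-x_bar @ w))
    return x_bar * (p_bar - y_bar)

w = np.zeros(d)
lr, batch = 0.5, 32
for _ in range(2000):
    idx = rng.choice(n, size=batch, replace=False)
    # Variance-reduced estimator (coefficient a fixed to 1 for simplicity):
    # still unbiased because E[cv] is known exactly.
    g = noisy_grad(w, idx) - (cv(w, idx) - cv_mean(w))
    w -= lr * g

How much the correction helps depends on how strongly the control variate correlates with the noisy gradient; fixing a = 1 as above is the simplest choice, and estimating the optimal coefficient from running covariance estimates typically reduces the variance further.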
