International Conference on Machine Learning

PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization



Abstract

Alternating gradient descent (A-GD) is a simple but popular algorithm in machine learning, which updates two blocks of variables in an alternating manner using gradient descent steps. In this paper, we consider a smooth unconstrained nonconvex optimization problem and propose a perturbed A-GD (PA-GD) that converges (with high probability) to second-order stationary points (SOSPs) at a global sublinear rate. Existing analyses of A-GD-type algorithms either guarantee convergence only to first-order solutions, or establish convergence to second-order solutions asymptotically (without rates). To the best of our knowledge, this is the first alternating-type algorithm that takes O(polylog(d)/ε^2) iterations to achieve an (ε, √ε)-SOSP with high probability, where polylog(d) denotes a polynomial in the logarithm of the problem dimension d.
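The sketch below illustrates, in Python, the perturbed alternating update the abstract describes: two blocks x and y are refreshed alternately with gradient steps, and a small random perturbation is injected whenever the joint gradient is small so the iterates can move off strict saddle points. The toy objective f(x, y) = 0.5||x||^2 - x·y + 0.25||y||^4, the step size eta, the threshold eps, and the perturbation radius are illustrative assumptions, not the paper's exact algorithm parameters.

import numpy as np

def grad_x(x, y):
    # Gradient w.r.t. x of the toy objective f(x, y) = 0.5*||x||^2 - x.dot(y) + 0.25*||y||^4
    return x - y

def grad_y(x, y):
    # Gradient w.r.t. y of the same toy objective
    return -x + (y @ y) * y

def pa_gd(x, y, eta=0.05, eps=1e-3, radius=1e-2, iters=5000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        # Perturbation step: a small joint gradient may indicate a saddle point, so add
        # small Gaussian noise (the paper perturbs within a ball; Gaussian is used here
        # only for simplicity).
        if np.sqrt(gx @ gx + gy @ gy) <= eps:
            x = x + radius * rng.standard_normal(x.shape)
            y = y + radius * rng.standard_normal(y.shape)
        # Alternating gradient steps: update x first, then update y using the new x.
        x = x - eta * grad_x(x, y)
        y = y - eta * grad_y(x, y)
    return x, y

# (0, 0) is a strict saddle of the toy objective; the perturbation lets PA-GD escape it
# and reach a second-order stationary point (here x = y with ||y|| = 1).
x, y = pa_gd(np.zeros(3), np.zeros(3))
print(x, y)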

