
Convergence Rate of Incremental Gradient and Incremental Newton Methods



Abstract

The incremental gradient method is a prominent algorithm for minimizing a finite sum of smooth convex functions, used in many contexts including large-scale data processing applications and distributed optimization over networks. It is a first-order method that processes the functions one at a time based on their gradient information. The incremental Newton method, on the other hand, is a second-order variant which exploits additionally the curvature information of the underlying functions and can therefore be faster. In this paper, we focus on the case when the objective function is strongly convex and present fast convergence results for the incremental gradient and incremental Newton methods under constant and diminishing stepsizes. For a decaying stepsize rule $\alpha_k = \Theta(1/k^s)$ with $s \in (0,1]$, we show that the distance of the IG iterates to the optimal solution converges at rate ${\cal O}(1/k^{s})$ (which translates into a ${\cal O}(1/k^{2s})$ rate in the suboptimality of the objective value). For $s>1/2$, this improves the previous ${\cal O}(1/\sqrt{k})$ results in distances obtained for the case when the functions are non-smooth. We show that to achieve the fastest ${\cal O}(1/k)$ rate, incremental gradient needs a stepsize that requires tuning to the strong convexity parameter whereas the incremental Newton method does not. The results are based on viewing the incremental gradient method as a gradient descent method with gradient errors, devising efficient upper bounds for the gradient error to derive inequalities that relate distances of the consecutive iterates to the optimal solution, and finally applying Chung's lemmas from the stochastic approximation literature to these inequalities to determine their asymptotic behavior. In addition, we construct examples to show tightness of our rate results.