Distributed optimization algorithms are essential for training machine learning models on very large-scale datasets. However, they often suffer from communication bottlenecks. To address this issue, a communication-efficient primal-dual coordinate ascent framework (CoCoA) and its improved variant CoCoA+ have been proposed, achieving a convergence rate of $\mathcal{O}(1/t)$ for solving empirical risk minimization problems with Lipschitz continuous losses. In this paper, an accelerated variant of CoCoA+ is proposed and shown to possess a convergence rate of $\mathcal{O}(1/t^2)$ in terms of reducing suboptimality. The analysis of this rate is also notable in that the constants appearing in the convergence bounds are, except in extreme cases, significantly smaller than those previously provided for CoCoA+. Results of numerical experiments are provided to show that acceleration can lead to significant performance gains.
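For concreteness, the two guarantees can be written in a common form. The following is a minimal sketch assuming the usual dual-suboptimality convention for CoCoA-type methods; the dual objective $D$, the iterate $\alpha^{(t)}$ after $t$ communication rounds, the dual maximizer $\alpha^\star$, and the constants $C$ and $C'$ are illustrative placeholders rather than quantities taken from the paper:
% Hedged sketch of the claimed rates (possibly holding in expectation,
% depending on the randomness of the local solvers).
\begin{align*}
  \text{CoCoA+:}\quad
    & D(\alpha^\star) - D(\alpha^{(t)}) \le \frac{C}{t}, \\
  \text{accelerated variant:}\quad
    & D(\alpha^\star) - D(\alpha^{(t)}) \le \frac{C'}{t^2}.
\end{align*}
Per the abstract's claim about constants, $C'$ would moreover be significantly smaller than $C$ except in extreme cases, so the improvement is not only in the order of the rate.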