This thesis develops and analyzes distributed algorithms for convex optimization in networks, where nodes cooperatively minimize the sum of their locally known costs subject to a global variable of common interest. This setup encompasses highly relevant applications in networked systems, including distributed estimation and source localization in sensor networks, and distributed learning. Broadly, the existing literature offers two types of distributed algorithms to solve the above problem: 1) distributed (consensus-based) gradient methods; and 2) distributed augmented Lagrangian methods; both types, however, present several limitations. 1) Distributed gradient-like methods have slow practical convergence rates; further, they are usually studied for very general, non-differentiable costs, and the possibilities for speed-ups on more structured functions are not sufficiently explored. 2) Distributed augmented Lagrangian methods generally show good performance in practice, but there is a limited understanding of their convergence rates, especially of how the rates depend on the underlying network.

This thesis contributes to both classes of algorithms in several ways. We propose a new class of fast, Nesterov-like distributed gradient algorithms. We achieve this by exploiting the structure of convex, differentiable costs with Lipschitz continuous and bounded gradients. We establish their fast convergence rates in terms of the number of per-node communications, the number of per-node gradient evaluations, and the network spectral gap. Furthermore, we show that existing distributed gradient methods cannot achieve the rates of our methods under the same function classes. Our distributed Nesterov-like gradient algorithms achieve guaranteed rates for both static and random networks, including scenarios with intermittently failing links or randomized communication protocols. With respect to distributed augmented Lagrangian methods, we consider both deterministic and randomized distributed methods, subsuming known methods but also introducing novel algorithms. Assuming twice continuously differentiable costs with bounded Hessians, we establish global linear convergence rates in terms of the number of per-node communications and, unlike most of the existing work, in terms of the network spectral gap. We illustrate our methods with several applications in sensor networks and distributed learning.
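For concreteness, the problem setup described above admits the standard formulation below; the notation (N nodes, private costs f_i, common variable x) is generic and not taken verbatim from the thesis:

\[
\min_{x \in \mathbb{R}^d} \; f(x) \;=\; \sum_{i=1}^{N} f_i(x),
\]

where the cost f_i is known only to node i, and nodes exchange information solely with their immediate neighbors in the communication network.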
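To make the consensus-based gradient iteration concrete, the following is a minimal sketch of a standard distributed gradient (DGD-style) update on a ring network, not the accelerated Nesterov-like scheme developed in the thesis; the quadratic costs, Metropolis-type weights, and constant step size are illustrative assumptions.

```python
import numpy as np

# Illustrative setup (assumption): N nodes on a ring, each holding a private
# quadratic cost f_i(x) = 0.5 * a_i * x^2 - b_i * x, so grad f_i(x) = a_i * x - b_i.
N = 5
rng = np.random.default_rng(0)
a = rng.uniform(1.0, 2.0, N)
b = rng.uniform(-1.0, 1.0, N)

# Doubly stochastic mixing matrix W for the ring graph (uniform weights on
# self and the two neighbors).
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = 1.0 / 3.0
    W[i, (i + 1) % N] = 1.0 / 3.0
    W[i, i] = 1.0 / 3.0

x = np.zeros(N)      # each node's local estimate of the common variable
alpha = 0.05         # constant step size (illustrative)

for k in range(200):
    # consensus step (average with neighbors) followed by a local gradient step
    x = W @ x - alpha * (a * x - b)

# With a small (or diminishing) step size, all local estimates x_i approach a
# neighborhood of the minimizer of sum_i f_i, here (sum_i b_i) / (sum_i a_i).
print(x, b.sum() / a.sum())
```

With a constant step size this basic scheme converges only to a neighborhood of the optimum; the slow practical convergence of such methods is precisely the limitation that the thesis's Nesterov-like distributed gradient algorithms are designed to address.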