Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems. Whereas distributed estimation of sample mean statistics has been the subject of a good deal of attention, computation of U-statistics, relying on more expensive averaging over pairs of observations, is a less investigated area. Yet, such data functionals are essential to describe global properties of a statistical population, with important examples including Area Under the Curve, empirical variance, Gini mean difference and within-cluster point scatter. This paper proposes new synchronous and asynchronous randomized gossip algorithms which simultaneously propagate data across the network and maintain local estimates of the U-statistic of interest. We establish convergence rate bounds of O(1/t) and O(log t/t) for the synchronous and asynchronous cases respectively, where t is the number of iterations, with explicit data and network dependent terms. Beyond favorable comparisons in terms of rate analysis, numerical experiments provide empirical evidence the proposed algorithms surpasses the previously introduced approach.
展开▼
机译:对于网络中的分散估计而言,高效而强大的算法对于许多分布式系统至关重要。样本均值统计的分布式估计一直是引起广泛关注的主题,而依靠对成对的观测值进行更昂贵的平均计算得出的U统计量却是一个研究较少的领域。但是,此类数据功能对于描述统计总体的全局属性至关重要,其重要示例包括曲线下面积,经验方差,基尼均值差和集群内点分散。本文提出了一种新的同步和异步随机八卦算法,该算法可同时在网络上传播数据并维护感兴趣的U统计量的本地估计。我们分别针对同步和异步情况建立O(1 / t)和O(log t / t)的收敛速率边界,其中t是迭代次数,具有显式数据和网络相关项。除了在速率分析方面的有利比较之外,数值实验还提供了经验证据,表明所提出的算法优于先前引入的方法。
展开▼