Gossip, also known as epidemic dissemination, is becoming an increasingly popular technique in distributed systems. Yet, it has remained a partially open question: how robust are such protocols? We consider a natural extension of the random phone-call model (introduced by Karp et al. [1]), and we analyze two different notions of robustness: the ability to tolerate adaptive failures, and the ability to tolerate oblivious failures. For adaptive failures, we present a new gossip protocol, TrickleGossip, which achieves near-optimal O(n log~3 n) message complexity. To the best of our knowledge, this is the first epidemic-style protocol that can tolerate adaptive failures. We also show a direct relation between resilience and message complexity, demonstrating that gossip protocols which tolerate a large number of adaptive failures need to use a super-linear number of messages with high probability. For oblivious failures, we present a new gossip protocol, CoordinatedGossip, that achieves optimal O(n) message complexity. This protocol makes novel use of the universe reduction technique to limit the message complexity.
展开▼