This paper presents performance-oriented refinements and distributed implementation of a reconfigurable linearizable data service for read/write atomic objects. This service is based on the work of Lynch and Shvartsman, and it guarantees consistency under dynamic conditions involving asynchrony, message loss, and node arrivals, departures, and failures. To achieve fault tolerance and availability the service replicates objects at several dynamically, changeable network nodes, to which we refer as owners. All-to-all gossip protocol is used to keep replicas up to date and to maintain the list of the owners. However, when gossip is unconstrained and communication bandwidth is limited, network congestion may degrade system's performance. Moreover, we identify a problem where under certain scenarios read/write operations may become delayed or blocked. This paper introduces a more practical algorithm that introduces two refinements. First, we reduce communication cost by restricting the all-to-all gossip pattern to replica owners, based on the local decisions of the participating nodes. In this setting we analyze the latency of read/write operations. Second, we present a solution that allows blocked (or delayed) operations to resume processing and complete successfully. We restate the conditional analysis accordingly. Finally, we engineered a complete distributed system implementing this service and we present empirical results that illustrate the advantages of our approach.
展开▼