Emerging distributed systems have a dynamic structure that is self-defined at any time by entities that autonomously decide to locally run the same distributed application. As extreme, a distributed system might thus also cease its existence when no entity is currently active while at some later moment new entities arrive and connect to each other to form again the system. Therefore the set of entities that over time might form the system is potentially infinite. This infinite arrival model is the key distinguishing factor between dynamic systems and traditional distributed systems where, on the contrary, the set of system entities is fixed since the deployment of application components is controlled and managed. In this model not only the set of correct processes is not known in advance, but no finite set containing the set of correct processes is known in advance. This actually is a higher level of uncertainty to be mastered in dynamic systems. This makes the study of possible implementations of failure detectors, as Ω, of paramount importance and at the same time makes the problem of realizing such failure detector far from being trivial. The uncertainty posed by infinite arrival models brings to two different issues (1) discovering the finite set of processes currently running and (2) dealing with a possible infinite set of non-correct processes that may wake up at any time, covering with their up times the whole computation. By assuming only temporary partitions, each process can witness its presence in the system by periodically sending a heartbeat and its identifier, to let other up processes know it and consider it as part of the system. At first glance, it may seem that discovering the finite set of processes currently running is the hard part of the problem, herein solved by assumption, and that among this set it is possible to eventually select a unique leader following well-known solutions employed in the crash-failure model. However, in the crash-failure model the objective is to assure that any correct process is eventually able to univocally select one correct process among an initial set containing a finite number of non-correct processes. By eventually suspecting as crashed those processes whose heartbeats/messages stops to arrive, all the complexity lies in avoiding to falsely suspect at least one correct process infinitely often. Solving this issue means that eventually and permanently at least one correct process will be considered as alive. If more than one correct process is considered as alive, processes can be totally and locally ordered at each process sorting their identifiers. By establishing this total order it is possible to apply at each process a local deterministic rule that independently chooses the same process (e.g. the one with the lowest identifier) as leader. In contrast to this crash-failure model with a finite set of processes, the infinite arrival model implies that heartbeats may arrive from correct and non-correct processes over the entire computation. A list of alive processes built by sorting process identifiers will continually include correct processes and up but non-correct processes since for any identifier assigned to a correct process, an infinite set of non-correct processes with a lower identifier or higher identifier (e.g. processes running on nodes with a lower, respectively higher, IP address) may arrive over the entire computation. Without a selection of the only set of correct processes, any choice on the flat mixed set may lead to elect a non-correct leader infinitely often, violating the specification of Ω.
展开▼