This paper addresses the problem of tolerating hardware and software faults in distributed computing environments, together with the related issues of efficiency and flexibility. A set of new fault-tolerant architectures is presented, along with a detailed dependability analysis of these architectures and an evaluation of their efficiency and response time. The proposed architectures are designed mainly for general-purpose distributed computing systems, where many unrelated applications may compete for both hardware and software resources, so that system characteristics are highly variable and dynamic. Stress is thus placed on adaptation: a major feature of the architectures under consideration is the adaptive execution of redundant components, so as to minimize hardware resource consumption and shorten the response time as much as possible for a required level of fault tolerance. The analytical results show that adaptive architectures are able to make efficient use of available resources without compromising dependability and that, for certain application environments, they have a lower probability of violating given timing constraints than static architectures.