Building Blocks for Fault-Tolerant Distributed Systems

One of our current application case studies aims to identify useful "building blocks" for efficient, fault-tolerant distributed systems. These building blocks will include a wide range of communication and memory abstractions, as well as special-purpose building blocks such as transaction-processing modules, failure detectors, and resource allocators. The specification of each building block will include performance and fault-tolerance information in addition to ordinary correctness information. Such building blocks should be useful to system designers because they will permit unambiguous specification of required system and component behavior (including performance and fault-tolerance properties), support system modification, and speed up the design process by permitting reuse of software and preventing expensive design errors. Some of our specific projects are:

Last modified: July 15, 1999