Elizabeth Borowsky, Eli Gafni, Nancy Lynch, and Sergio Rajsbaum. BG Distributed Simulation Algorithm. Technical Memo MIT/LCS/TM-573, Laboratory for Computer Science, 545 Technology Square, Cambridge, MA 02139, December 1997. Also, submitted for journal publication. .pdf.
Abstract
A snapshot shared memory algorithm is presented, allowing a set of f+1 processes, any f of which may exhibit stopping failures, to ``simulate'' a larger number n of processes, also with at most ffailures.
One application of this simulation algorithm is to convert an arbitrary k-fault-tolerant n-process solution for the k-set-agreement problem into a wait-free k+1-process solution for the same problem. Since the k+1-process k-set-agreement problem has been shown to have no wait-free solution [1,2,3], this transformation implies that there is no k-fault-tolerant solution to the n-process k-set-agreement problem, for any n.
More generally, the algorithm satisfies the requirements of a fault-tolerant distributed simulation. The distributed simulation implements a notion of fault-tolerant reducibility between decision problems. These notions are defined, and examples of their use are provided.
The algorithm is presented and verified in terms of I/O automata. The presentation has a great deal of interesting modularity, expressed by I/O automaton composition and both forward and backward simulation relations. Composition is used to include a safe agreement module as a subroutine. Forward and backward simulation relations are used to view the algorithm as implementing a multi-try snapshot strategy.
The main algorithm works in snapshot shared memory systems; a simple modification of the algorithm that works in read/write shared memory systems is also presented.
[1] E. Borowsky and E. Gafni. Generalized FLP impossibility result for $t$-resilient asynchronous computations. Proceedings of the 1993 ACM Symposium on Theory of Computing}, May 1993, 91--100.
[2] M.P. Herlihy and N. Shavit. The asynchronous computability theorem for $t$-resilient tasks. Proceedings of the 1993 ACM Symposium on Theory of Computing}, May 1993, 111--120.
[3] M. Saks and F. Zaharoglou. Wait-free $k$-set agreement is impossible: The topology of public knowledge. Proceedings of the 1993 ACM Symposium on Theory of Computing}, May 1993, 101--110.