Reference:

Beng-Hong Lim and Anant Agarwal. Waiting Algorithms for Synchronization in Large-Scale Multiprocessors. ACM Transactions on Computer Systems, pages 253-294, August 1993. Also available as MIT VLSI Memo 91-632.
(pdf, compressed postscript)

Abstract:

Through analysis and experiments, this paper investigates two-phase waiting algorithms to minimize the cost of waiting for synchronization in large-scale multiprocessors. In a two-phase algorithm, a thread first waits by polling a synchronization variable. If the cost of polling reaches a limit Lpoll and further waiting is necessary, the thread is blocked, incurring an additional fixed cost, B. The choice of Lpoll is a critical determinant of the performance of two-phase algorithms. We focus on methods for statically determining Lpoll because the run-time overhead of dynamically determining Lpoll can be comparable to the cost of blocking in large-scale multiprocessor systems with lightweight threads.

Our experiments show that always-block (Lpoll=0) is a good waiting algorithm with performance that is usually close to the best of the algorithms compared. We show that even better performance can be achieved with a static choice of Lpoll based on knowledge of likely wait-time distributions. Motivated by the observation that different synchronization types exhibit different wait-time distributions, we prove that a static choice of Lpoll can yield close to optimal on-line performance against an adversary that is restricted to choosing wait times from a fixed family of probability distributions. This result allows us to make an optimal static choice of Lpoll based on synchronization type. For exponentially distributed wait times, we prove that setting Lpoll=ln(e-1)B results in a waiting cost that is no more than e/(e-1) times the cost of an optimal off-line algorithm. For uniformly distributed wait times, we prove that setting Lpoll = 1/2(sqrt(5) - 1)B results in a waiting cost that is no more than (sqrt(5) + 1)/2 (the golden ratio) times the cost of an optimal off-line algorithm. Experimental measurements of several parallel applications on the Alewife multiprocessor simulator corroborate our theoretical findings.


Back to Alewife Publications.
Back to Alewife.

webmaster@cag.lcs.mit.edu $Date: 1998/01/06 16:49:48 $