Synchronization and Checkpointing.

Calls to bsp_sync() not only indicate barrier synchronizations, but also mark potential checkpoint locations where a virtual process may be paused, saved, and moved to a different physical processor. While pausing and moving processes is easily done as described in Sect. 3.1, resuming execution is harder. In C, resuming execution can be done relatively easily by either using system-specific low-level mechanisms (as done in [13]), or using a pre-compiler to annotate the source code with a jump table using switch and goto statements [21]. Unfortunately, these techniques are impossible in Java, where there are no explicit goto's - not even at the virtual machine level.

Thus, instead of using direct jumps, we use if statements to skip over already-executed code. Figures 4 and 5 show an example. In general, the code for a superstep should be enclosed in an if (bsp_step()) block and ended with a call to bsp_sync(), as shown in step 0. As a syntactic shortcut, an if (bsp_sync()) statement can be used to end a superstep and start the next one at the same time, as shown at the end of step 1. In cases where function calls, for, while, if-else, or other special constructs prevent a superstep's code from being enclosed in one block, one can use an if (bsp_cont()) statement to continue the superstep on the other side of the offending statement as shown.

**Figure 4:** BSP pseudo-code without code-skipping.
$\begin{figure} \begin{footnotesize} \begin{verbatim} void bsp_run() step 2+n... ...d in next col. ... } // end foo();\end{verbatim} \end{footnotesize} \end{figure}$

**Figure 5:** Transformed BSP pseudo-code with code skipping.
$\begin{figure} \begin{footnotesize} \begin{verbatim} void bsp_run() if ( bsp... ...) // code continued in next col.\end{verbatim} \end{footnotesize} \end{figure}$

Figure 6 shows how these methods are implemented. When restarting a checkpointed BSPWork object, bsp_run() is called with curstep initialized to 0, and restorestep set to the step to be resumed. The bsp_step() and bsp_cont() functions, called at the beginning of each code block, return false while curstep has not yet reached restorestep, allowing blocks that have already been executed to be skipped. At the end of each skipped superstep, the call to bsp_sync() increments curstep so that curstep eventually reaches restorestep, and the desired superstep's block is allowed to execute. The next call to bsp_sync() then ends the superstep by updating restorestep and throwing an exception which causes bsp_run() to return control to the work engine.

**Figure 6:** Code for `bsp_sync()`, `bsp_step()`, `bsp_cont()`.
$\begin{figure} \begin{footnotesize} \begin{verbatim} protected boolean bsp_s... ..._step(); { return bsp_step(); } }\end{verbatim} \end{footnotesize} \end{figure}$

As demonstrated in Fig. 5, these methods can be used even when bsp_sync() calls occur in loops, if-else blocks, and function bodies. Moreover, the code transformation rules are simple enough to be applied manually without using pre-compilers. In many cases, all a programmer needs to do is enclose the superstep code in braces, and add an if in front of bsp_sync(). As shown here, and more clearly in the realistic sample code in Sect. 4.2, the resulting code is still reasonably readable. In fact, the indented blocks may even help emphasize the divisions between supersteps. This simplicity offers a significant advantage in convenience over other programming interfaces that require language changes and sophisticated pre-compilers, since it does not require programmers to generate and keep track of extra files, and allows them to more easily use off-the-shelf integrated development and build environments. Furthermore, if more readability is desired, one can implement a parallel language with nothing more than a C-style preprocessor by using #define macros to replace if (bsp_step()) with parbegin, bsp_sync(); with parend, and if (bsp_cont()) with parcont.

As far as we know, the only significant disadvantage of this code-skipping technique is that resuming execution inside or after an unbounded or very long loop may take unnecessary extra time because the skipping has to go through all previous iterations of the loop before reaching the current superstep. We are currently working on a solution to this problem, which may take the form of a special version of bsp_cont() which updates loop variables and curstep without having to iterate through the loop.