next up previous
Next: Programming interface improvements. Up: Current Issues and Future Previous: Peer-to-peer communication.

Fault-tolerance across supersteps.

At present, our crash-tolerance and fault-tolerance mechanisms only work within a superstep. It is not clear yet how they can be extended if checkpoints are not performed each superstep. In such cases, a crash or fault may force a rollback to the previous checkpoint, possibly wasting work unnecessarily. It would be useful to develop a way to recover from a fault without having to do a full rollback.

Luis Sarmenta