Fault-tolerance
Stopping faults
- processors crashing or leaving network
- handled by adaptive parallelism
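One way to picture how a stopping fault is absorbed: work held by a processor that crashes or leaves is put back in the pool and handed to another processor. The sketch below is a minimal illustration under that assumption, not the scheme of any particular system; `WorkerGone`, `run_tasks`, and the worker callables are hypothetical names introduced here.

```python
from collections import deque


class WorkerGone(Exception):
    """Hypothetical signal: a simulated worker crashed or left mid-task."""


def run_tasks(tasks, workers):
    """Run every task, re-queuing any task whose worker stopped.

    `workers` are callables taking a task and returning a result;
    a worker that raises WorkerGone is dropped from the pool.
    """
    pending = deque(tasks)
    alive = deque(workers)
    results = {}
    while pending:
        if not alive:
            raise RuntimeError("all workers have stopped")
        task = pending.popleft()
        worker = alive.popleft()
        try:
            results[task] = worker(task)
            alive.append(worker)   # still available: back into rotation
        except WorkerGone:
            pending.append(task)   # reassign the task; the worker is not re-added
    return results


def crashing_worker(task):
    """Simulated processor that stops before returning anything."""
    raise WorkerGone()


def reliable_worker(task):
    """Simulated processor that always finishes (doubles its input)."""
    return task * 2
```

For example, `run_tasks([1, 2, 3], [crashing_worker, reliable_worker])` still completes all three tasks, because the task given to the crashed worker is reassigned.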
Byzantine faults
- all other problems, intentional and unintentional
Solutions
- replication (e.g., majority voting)
- spot-checking
  - saboteurs may get through, but get caught eventually
- problem choice
  - easily checked results (e.g., searches with rare solutions)
  - fault-tolerant problems (e.g., rendering)
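The first two solutions can be sketched in a few lines. This is a minimal illustration, assuming workers are plain callables and that the master holds some tasks whose answers it already knows; `majority_vote`, `spot_check`, `check_rate`, and the example workers are hypothetical names, not part of any specific system.

```python
import random
from collections import Counter


def majority_vote(results):
    """Replication: accept the value returned by a strict majority of workers.

    Returns None when no value has more than half the votes.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


def spot_check(worker, known_tasks, check_rate, rng):
    """Spot-checking: randomly give the worker tasks with known answers.

    `known_tasks` is a list of (task, expected_result) pairs. Each pair is
    checked with probability `check_rate`. Returns False as soon as the
    worker is caught returning a wrong answer, True otherwise.
    """
    for task, expected in known_tasks:
        if rng.random() < check_rate and worker(task) != expected:
            return False
    return True


def honest(task):
    """Example worker that computes the correct result (a square)."""
    return task * task


def saboteur(task):
    """Example worker that always returns a wrong result."""
    return task * task + 1


known = [(n, n * n) for n in range(10)]
rng = random.Random(0)
```

With `check_rate=1.0` the saboteur is caught on the first task; with a low rate it may get through a given batch, but each extra check is another chance to catch it, which is the sense of "caught eventually" above.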