Finding Latent Code Errors Via Machine Learning Over Program Executions
Yuriy Brun

This talk will propose a technique that identifies program properties that may indicate errors. The technique generates machine learning models of run-time program properties known to expose faults, and applies these models to program properties of user-written code to classify and rank properties that may lead the user to errors.

This talk will evaluate an implementation, the Fault Invariant Classifier, that demonstrates the efficacy of the error-finding technique. The implementation uses dynamic invariant detection to generate program properties, and support vector machine and decision tree learning tools to classify those properties. Given a set of properties produced by the program analysis, some of which are indicative of errors, the technique selects a subset of properties that likely contains a property that reveals an error. In our experiments, the technique increases the relevance (the concentration of properties that reveal errors) by a factor of 50 on average. Furthermore, the technique can rank properties according to their likelihood of indicating an error, and the first error-indicating property typically appears early in the list. On average, a user must examine only the 7.8 highest-ranked properties to find a fault-revealing property.
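The two measures mentioned above, relevance and the rank of the first fault-revealing property, can be illustrated with a minimal Python sketch. This is not the Fault Invariant Classifier itself: the `Property` record, the scores, the threshold, and the toy data are all hypothetical and chosen only to show how the quantities could be computed once some classifier has scored each property.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Property:
    text: str               # the program property (invariant) as text
    score: float            # hypothetical classifier confidence in [0, 1]
    fault_revealing: bool   # ground truth, known only when evaluating


def relevance(props: List[Property]) -> float:
    """Concentration of fault-revealing properties in a set."""
    if not props:
        return 0.0
    return sum(p.fault_revealing for p in props) / len(props)


def select(props: List[Property], threshold: float) -> List[Property]:
    """Keep the properties the classifier scores at or above the threshold."""
    return [p for p in props if p.score >= threshold]


def rank(props: List[Property]) -> List[Property]:
    """Order properties from most to least likely to indicate an error."""
    return sorted(props, key=lambda p: p.score, reverse=True)


def first_fault_rank(ranked: List[Property]) -> Optional[int]:
    """1-based position of the first fault-revealing property, if any."""
    for position, prop in enumerate(ranked, start=1):
        if prop.fault_revealing:
            return position
    return None


# Toy data, invented for illustration only (not experimental results).
props = [
    Property("sum == orig(sum) + v", 0.95, False),
    Property("i < size(a)", 0.91, True),
    Property("count <= max", 0.62, True),
    Property("n >= 0", 0.40, False),
    Property("y > 0", 0.22, False),
    Property("x != null", 0.15, False),
    Property("ptr == orig(ptr)", 0.08, False),
    Property("flag one of {0, 1}", 0.05, False),
]

baseline = relevance(props)                    # relevance of the full set
selected = select(props, threshold=0.6)        # subset chosen by the classifier
improvement = relevance(selected) / baseline   # factor by which relevance rose
position = first_fault_rank(rank(props))       # properties a user would examine

print(f"baseline relevance: {baseline:.2f}")
print(f"selected relevance: {relevance(selected):.2f}")
print(f"improvement factor: {improvement:.2f}")
print(f"first fault-revealing property at rank: {position}")
```

In this sketch the improvement factor is the ratio of the selected subset's relevance to that of the full property set, and the rank of the first fault-revealing property is the number of properties a user would read before reaching one that exposes an error.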

Back to the Programming Systems Graduate Seminar.

Last updated: Wed Jun 25 12:15:37 EDT 2003