Finding Latent Code Errors Via Machine Learning Over Program Executions
Yuriy Brun
This talk will propose a technique that identifies program properties
that may indicate errors. The technique generates machine learning
models of run-time program properties known to expose faults, and
applies these models to program properties of user-written code to
classify and rank properties that may lead the user to errors.
This talk will evaluate an implementation, the Fault Invariant
Classifier, that demonstrates the efficacy of the error-finding
technique. The implementation uses dynamic invariant detection to
generate program properties and support vector machine and decision
tree learning tools to classify those properties. Given a set of
properties produced by the program analysis, some of which are
indicative of errors, the technique selects a subset of properties
that likely contains a property that reveals an error. In our
experiments, the technique increases the relevance (the concentration
of properties that reveal errors) by a factor of 50 on average.
Furthermore, the technique can rank properties according to likelihood
of indicating an error, and the first error-indicating property
typically appears early in the list. On average, a user must examine
only the 7.8 highest-ranked properties to find a fault-revealing
property.
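
The two measures quoted above, relevance and the rank of the first
fault-revealing property, can be illustrated with a minimal sketch. It
is not the Fault Invariant Classifier itself: the property strings, the
scores (stand-ins for a trained classifier's output), the 0.5 threshold,
and all function names are hypothetical and chosen only for
illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Property:
    text: str              # hypothetical program property (invariant)
    score: float           # assumed classifier score; higher = more suspicious
    fault_revealing: bool  # ground truth, known only in an experiment


def relevance(props: List[Property]) -> float:
    """Concentration of fault-revealing properties in a set."""
    if not props:
        return 0.0
    return sum(p.fault_revealing for p in props) / len(props)


def select(props: List[Property], threshold: float) -> List[Property]:
    """Keep the properties the (assumed) classifier flags as suspicious."""
    return [p for p in props if p.score >= threshold]


def rank(props: List[Property]) -> List[Property]:
    """Order properties from most to least suspicious."""
    return sorted(props, key=lambda p: p.score, reverse=True)


def first_fault_rank(ranked: List[Property]) -> Optional[int]:
    """1-based position of the first fault-revealing property, if any."""
    for position, prop in enumerate(ranked, start=1):
        if prop.fault_revealing:
            return position
    return None


# Made-up data: 8 properties, 2 of which reveal a fault.
PROPS = [
    Property("x >= 0", 0.10, False),
    Property("i < len(a)", 0.92, True),
    Property("p != null", 0.95, False),
    Property("sum == a + b", 0.05, False),
    Property("n == size", 0.20, False),
    Property("lo <= hi", 0.70, True),
    Property("count > 0", 0.15, False),
    Property("buf.length == 16", 0.30, False),
]

selected = select(PROPS, threshold=0.5)
improvement = relevance(selected) / relevance(PROPS)
examined = first_fault_rank(rank(PROPS))

print(f"relevance before selection: {relevance(PROPS):.2f}")
print(f"relevance after selection:  {relevance(selected):.2f}")
print(f"improvement factor:         {improvement:.2f}")
print(f"properties examined:        {examined}")
```

On this toy data, selection keeps three properties, two of which are
fault-revealing, and a user reading the ranked list reaches the first
fault-revealing property at position 2. The numbers say nothing about
the reported results; they only show how the two quantities are
computed.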