Network intrusion detection systems often rely on matching patterns
that are gleaned from known attacks. While this method is reliable and
rarely produces false alarms, it has the obvious disadvantage that it
cannot detect novel attacks. An alternative approach is to learn a
model of normal traffic and report deviations, but these anomaly
models are typically restricted to modeling IP addresses and ports,
and do not include the application payload where many attacks
occur. We describe a novel approach to anomaly detection. We extract a
set of attributes from each event (IP packet or TCP connection),
including strings in the payload, and induce a set of conditional
rules which have a very low probability of being violated in a
nonstationary model of the normal network traffic in the training
data. On the 1999 DARPA intrusion detection evaluation data set, we
detect about 60% of 190 attacks at a false alarm rate of 10 per day
(100 total). We believe that anomaly detection can work because most
attacks exploit software or configuration errors that escaped field
testing, and so are exposed only under unusual conditions.
Though our rule learning techniques are applied to network intrusion
detection, they are general enough for detecting anomalies in other
applications.
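
As a rough illustration of the rule-learning idea described above, the
following minimal Python sketch induces conditional rules of the form
"if attribute A has value a, then attribute B takes one of the values
seen in training" and scores a test event by the rules it violates. It
is a simplification under stated assumptions, not the method evaluated
here: the function names (learn_rules, anomaly_score), the attribute
names (port, word1), the r/n estimate of violation probability, the
n/r score, and the 0.25 threshold are all hypothetical choices made for
the example, and the sketch ignores the nonstationary (time-dependent)
aspect of the model.

```python
from collections import defaultdict


def learn_rules(events, max_violation_prob=0.25):
    """Induce rules: if cond_attr == cond_val then target_attr in allowed set.

    Assumption for this sketch: a rule's violation probability is estimated
    as r / n, where n is the number of training events matching the condition
    and r is the number of distinct target values seen among them.
    """
    support = defaultdict(int)   # (cond_attr, cond_val, target_attr) -> n
    allowed = defaultdict(set)   # same key -> observed target values
    for ev in events:
        for cond_attr, cond_val in ev.items():
            for target_attr, target_val in ev.items():
                if target_attr == cond_attr:
                    continue
                key = (cond_attr, cond_val, target_attr)
                support[key] += 1
                allowed[key].add(target_val)

    rules = {}
    for key, n in support.items():
        r = len(allowed[key])
        # Keep only rules that are rarely expected to be violated.
        if r / n <= max_violation_prob:
            rules[key] = (allowed[key], n, r)
    return rules


def anomaly_score(event, rules):
    """Sum n / r over every rule whose condition holds but whose target fails."""
    score = 0.0
    for (cond_attr, cond_val, target_attr), (values, n, r) in rules.items():
        if event.get(cond_attr) == cond_val and event.get(target_attr) not in values:
            score += n / r
    return score


# Hypothetical training events: destination port and first payload word.
train = [{"port": 80, "word1": "GET"} for _ in range(8)]
train += [{"port": 25, "word1": "HELO"} for _ in range(8)]
rules = learn_rules(train)

normal = anomaly_score({"port": 80, "word1": "GET"}, rules)    # no rule violated
unusual = anomaly_score({"port": 80, "word1": "HELO"}, rules)  # two rules violated
```

In this toy data every rule has n = 8 and r = 1, so the unusual event,
which pairs an HTTP port with an SMTP keyword, receives a higher score
than the normal one. Including the payload word as an attribute is what
lets the rules react to application-level content rather than to
addresses and ports alone.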