Google Flu Trends uses Google’s query logs to detect outbreaks of the flu in the United States sooner than the Centers for Disease Control can. Apparently flu-related search queries (like “flu symptoms”) are strongly correlated with actual flu rates, and a spike in search queries can be detected sooner than the corresponding spike in doctor visits (which is the data that the CDC relies on). The New York Times has a good graphic comparing CDC data against Google’s data for the last few years, including a surprising spike back in October 2003, when 8% of doctor visits were flu-related, presumably because the vaccine didn’t cover the right set of viruses that year.
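To make the "correlated, but earlier" idea concrete, here is a minimal sketch of how one might check whether a query series leads a doctor-visit series. This is not how Google does it (I don't know their method); the function names `pearson` and `best_lead` are hypothetical, and the weekly numbers are made up purely for illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_lead(queries, visits, max_lead):
    """Return (lead, r): the number of weeks by which queries best lead visits."""
    scores = []
    for lead in range(max_lead + 1):
        # Pair queries[t] with visits[t + lead].
        q = queries[:len(queries) - lead]
        v = visits[lead:]
        scores.append((pearson(q, v), lead))
    r, lead = max(scores)
    return lead, r


# Made-up weekly numbers, for illustration only (not Google or CDC data).
queries = [1, 1, 2, 5, 9, 6, 3, 2, 1, 1, 1, 1]
visits = [1, 1, 1, 1, 2, 5, 9, 6, 3, 2, 1, 1]  # same shape, two weeks later

lead, r = best_lead(queries, visits, max_lead=4)
print(f"queries lead visits by {lead} weeks (r = {r:.2f})")
```

With real data the two curves would not match perfectly, so the interesting question is how high the correlation stays at the best lead, and whether that lead is stable from one flu season to the next.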
Google may not know everything, but they sure know a lot. Flu Trends could also be cited as an example of “collective intelligence”: the masses (in this case unwittingly) know something that’s costly for an institution to learn, and demonstrate what they know through their collective behavior. The Times article ends with a nice quote from Tom Malone.
Sadly, the y-axis of the graph in Flu Trends is unlabeled. What do the data points represent? Queries per day? Are incidence rates in Florida comparable with incidence rates in Massachusetts? If I want to move to a less flu-ridden climate, where should I go? It’s not obvious from the UI.
- Rob Miller