How Nate Silver Predicted Obama's Win
Throughout last year's election, Nate Silver ran a fascinating website called FiveThirtyEight, named after the number of electors in the United States electoral college. Silver is a statistician, and he spent every day applying his considerable math chops to the various problems of the election -- namely, who would win, where, and by how much. By aggregating and analyzing publicly available poll data, Silver developed a series of statistical models that predicted, with stunning accuracy, what would happen on election day.
Before launching FiveThirtyEight, Silver made his name as a baseball statistician. A New York Magazine profile from last fall explains:
...At his day job, Silver works for Baseball Prospectus, a loosely organized think tank that, in the last ten years, has revolutionized the interpretation of baseball stats. Furthermore, Silver himself invented a system called PECOTA, an algorithm for predicting future performance by baseball players and teams. (It stands for "player empirical comparison and optimization test algorithm," but is named, with a wink, after the mediocre Kansas City Royals infielder Bill Pecota.) Baseball Prospectus has a reputation in sports-media circles for being unfailingly rigorous, occasionally arrogant, and almost always correct.
When Silver turned his attention to politics, his baseball stats experience came in handy. Silver's models are predictive, meaning that his intent was to figure out what would happen on a certain date (November 4) based on the trend and accuracy of all the available data. This is a little different from what most political pollsters do -- their typical task is to take a snapshot of the electorate on a given day, then analyze that. From the New York article:
As pollster John Zogby put it to me, "We take snapshots. And when you take many snapshots in a row, you get motion pictures."
Just as a sports forecast tries to call a larger outcome (like which team will win the pennant) from a series of sub-goals (teams winning various games and playoffs), Silver saw the presidential and congressional elections as a series of smaller games in each state (primaries) that would select the players (candidates), who would eventually compete in the World Series (the election). He had tons of data coming out each day from existing pollsters. He just needed a methodology for using that data and pulling good predictions from it. The New York profile continues:
So he came up with a system that predicts a pollster's future performance based on how good it's been in the past. In finding his average, Silver weights each poll differently—ranking them according to his own statistic, PIE (pollster-introduced error)—based on a number of factors, including its track record and its methodology. One advantage of this system is that, during the primaries, the system actually got smarter. Because each time a poll performed well in a primary, its ranking improved.
For the general election, this gets trickier, since you have polls coming out every single day and you can't know which ones are getting it right until Election Day. You can, however, weigh these new polls based on the pollster's history, the poll's sample size, and how recently the poll was conducted. You can also track trends over time and use these trend lines to forecast where things will end up on November 4. You can also, as Silver has done, analyze all the presidential polling data back to 1952, looking for information as to what is likely to happen next. (For example, how much the polls are likely to tighten in the last month of the race, which they traditionally do.) You can also run 10,000 computer simulations of the election every day based on your poll projections. (Think of this as sort of like that scene at the end of WarGames, where the computer blurs through every possible nuclear-war scenario.) As of October 8, the day after the town-hall debate, Silver's simulations had Obama winning the election 90 percent of the time.
In the end, Silver got it right. His models predicted the popular vote within 1 percentage point and called 49 of 50 state races correctly. Since the election, Silver has continued to analyze special topics, like the Coleman/Franken recount, and his work there has been entertaining as well as enlightening (see his post on real Minnesota ballots, entitled Brett Favre Beats Lizard People).
If you're interested in politics, statistics, or even baseball, I think you'll enjoy the New York profile of Silver, which came out before the election was decided. For a bit of after-the-fact analysis, check out this New York Times article from November 9. In any case, Nate Silver is a kind of nerd hero for me: a man whose statistical superpowers brought him to national prominence. Let's hear it for math!