Nate vs. Leo (and Another Nate): On FiveThirtyEight and The Upshot

Statistics guru Nate Silver has long been considered a master political prognosticator, and for some time he held a virtual monopoly over what has since come to be known as “data journalism.” But around the time that Silver ended his four-year tenure at the New York Times to build an expanded version of his popular FiveThirtyEight blog under the patronage of ESPN, several other players moved into the market: Ezra Klein, formerly of the Washington Post’s Wonkblog, hired “literally everybody” to help him launch Vox, and the Times’ David Leonhardt, creator of the now-defunct Economix blog, succeeded Silver as the paper’s nerd-in-chief with the debut of The Upshot in March 2014.

Much has been written about the strengths and weaknesses of each of the sites, including by our own Chris Fegan a few months back. But there’s one story about the data journalism food fight that has largely slipped under the radar: when it came time to start making forecasts for this fall’s elections for the U.S. Senate, The Upshot and its main politics writer, former New Republic contributor Nate Cohn, somehow managed to completely steal Nate Silver’s thunder.

Part of Silver’s unique appeal during the past few campaign seasons stemmed from the fact that he used statistical models to make quantitative forecasts of the outcomes of presidential and senatorial contests, as opposed to simply offering up the sort of qualitative assessments that are a dime a dozen elsewhere in the world of punditry. For all their weaknesses, these models had some important advantages: they allowed for new data to be quickly incorporated into FiveThirtyEight’s view of a race, and they made it possible to systematically attend to a wider range of variables than a mere human could on his own.

When The Upshot debuted “Leo,” its own model for forecasting the results of the 2014 Senate elections, I initially assumed that it would be a cheap knockoff of Silver’s more refined approach, and that I should really just wait for the new incarnation of the FiveThirtyEight model if I wanted to hear from professionals about what we ought to expect come November.

Leo’s methodology page features a Vox-style Q&A that walks readers through the mechanics of the model. Here’s the response to the first question, which asks about how Leo interprets polls:

We focus on the margin between two major candidates, taking steps to make different polls directly comparable. We tweak polls that count registered voters instead of likely ones. We make further adjustments depending on who conducted the poll.

“Well yes,” I thought, when I read that for the first time, “but they probably don’t make as many adjustments as Nate Silver would, like weighting polls based on their sample size or how recent they are.” Then I scrolled down to the next paragraph:

After adjusting the polls, we take a weighted average for each race, giving more weight to polls with a larger sample size and more recent polls (with a poll’s date being especially important the closer we get to Election Day). We also give more weight to a poll when we are more certain about its pollster’s house effect.
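
For readers who want to see what that kind of averaging looks like in practice, here is a minimal sketch. It is not Leo's actual code: the methodology page gives no formulas, so the square-root sample-size weight, the exponential decay with a 14-day half-life, and every name below (`Poll`, `poll_weight`, `weighted_margin`, `half_life_days`) are assumptions made up for illustration. The house-effect certainty weight is left out entirely.

```python
import math
from dataclasses import dataclass


@dataclass
class Poll:
    margin: float      # candidate A minus candidate B, in percentage points
    sample_size: int   # number of respondents
    days_old: int      # days since the poll was conducted


def poll_weight(poll: Poll, half_life_days: float = 14.0) -> float:
    """Hypothetical weight: larger and more recent polls count for more."""
    # Assumption: weight grows with the square root of the sample size,
    # mirroring how sampling error shrinks as the sample grows.
    size_weight = math.sqrt(poll.sample_size)
    # Assumption: a poll loses half its weight every `half_life_days` days.
    recency_weight = 0.5 ** (poll.days_old / half_life_days)
    return size_weight * recency_weight


def weighted_margin(polls: list[Poll], half_life_days: float = 14.0) -> float:
    """Weighted average of poll margins for a single race."""
    weights = [poll_weight(p, half_life_days) for p in polls]
    total = sum(weights)
    if total == 0:
        raise ValueError("need at least one poll with positive weight")
    return sum(w * p.margin for w, p in zip(weights, polls)) / total
```

Shrinking `half_life_days` as Election Day approaches would be one way to make a poll's date "especially important" late in the campaign, though whether Leo does anything like that is a guess.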

“That’s nice,” I chuckled condescendingly as I kept scrolling, “but I bet Leo doesn’t include any of the other sort of data that Nate Silver would, like candidates’ approval ratings or fundraising totals!” False:

For incumbents running for re-election, we consider their approval ratings. We also consider each candidate’s political experience; money raised; the state’s most recent presidential result; national polls on the public’s mood; and whether the election happens in a midterm or presidential year.

“Alright, this is a little better than I expected,” I said to myself, beginning to furrow my brow, “but Leo probably doesn’t account for the fact that the outcomes of races in different states tend to be correlated, which was something Nate Silver always thought was very important to model.” Also false:

We don’t think the races are independent. If the economy starts booming, it will probably help Democrats everywhere. If President Obama bungles an international crisis, Republicans everywhere could benefit. Even on Election Day, our model assumes the races will be correlated to some extent: The pollsters will tend to miss consistently in one direction or the other across the different races.
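
To see why this matters, consider a rough simulation sketch. It is an illustration of the general idea rather than anything taken from Leo or FiveThirtyEight: the function name `sweep_probability`, the normal error distributions, and the standard deviations are all assumptions. Each simulated election draws one national polling error shared by every race, plus an independent local error for each race, and then checks whether one party wins them all.

```python
import random


def sweep_probability(margins, national_sd, local_sd, n_sims=20000, seed=0):
    """Estimate the chance that one party wins every race in `margins`.

    `margins` holds that party's polled lead in each race, in points.
    Assumption: the true result is the polled margin plus a national
    error shared by all races plus an independent local error.
    """
    rng = random.Random(seed)
    sweeps = 0
    for _ in range(n_sims):
        # One shared draw per simulated election: this is what makes
        # the races correlated with one another.
        national_error = rng.gauss(0.0, national_sd)
        if all(m + national_error + rng.gauss(0.0, local_sd) > 0 for m in margins):
            sweeps += 1
    return sweeps / n_sims


# Five hypothetical toss-up races.
toss_ups = [0.0] * 5
independent = sweep_probability(toss_ups, national_sd=0.0, local_sd=3.0)
correlated = sweep_probability(toss_ups, national_sd=3.0, local_sd=1.0)
print(f"independent: {independent:.3f}, correlated: {correlated:.3f}")
```

If the five toss-ups were truly independent, a sweep would be a 1-in-32 event. Once most of the error is shared across states, sweeps become far more common, which is why a model that ignores correlation will understate the odds of a lopsided night.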

I finally realized not only that Leo was quite sophisticated, but also that it was virtually identical to the old FiveThirtyEight model. In fact, the methodology page basically admits as much:

Leo owes an intellectual debt to earlier models, including those created by political scientists and especially the FiveThirtyEight model, which popularized ideas about adjusting polls, combining polls with other information and national swings.

FiveThirtyEight has been releasing informal reads on the most competitive Senate races at regular intervals for the past several months. Silver has noted that it is the site’s “tradition” to begin transitioning to algorithmic predictions sometime during the summer. This is indeed what FiveThirtyEight did in 2010, when it began publishing results from its model at the end of August. Yet does one data point make a “tradition”? In 2012, Silver’s model was launched at the beginning of June – right around the same time of year that he made this comment.

One obvious response to those (like myself) who would criticize Silver and his team for letting The Upshot beat them to the punch is that unveiling a quantitative model too early might give a false impression of the precision with which the results of an election can be forecast many months out. Silver may have been worried that readers would fail to realize just how much uncertainty is associated with early predictions, and would put too much stock in seemingly precise numbers that aren’t really all that informative.

But this is always a danger, and Silver dealt with it in 2012 by posting confidence intervals alongside his forecasts of the popular and electoral votes. Moreover, FiveThirtyEight has argued on multiple occasions that early Senate polls have plenty to tell us about November. Here’s Harry Enten, in a piece from April entitled “Early Senate Polls Have Plenty to Tell Us About November”:

More than six months from the midterm elections, current polling and past precedent are competing for our trust. I analyzed which measure is more indicative come November, and it turns out that polls are a more robust metric even though their numbers are still sparse and there’s still so much time remaining before the election.

It’s not clear what Silver can do at this point to reassert his dominance. Maybe he’ll just try to rely on FiveThirtyEight’s superior name recognition. The site has about three times as many Twitter followers as The Upshot, so it’s possible that the efforts of Leo and Nate Cohn will simply be forgotten in the buzz surrounding the eventual rollout of FiveThirtyEight’s own model. But among hardcore political junkies, I can only assume that Silver’s brand has lost some of its luster. Barring a new model that features some truly innovative bells and whistles, it looks like he allowed himself to be totally outflanked by another guy named Nate.

In Silver’s first post at the new FiveThirtyEight, he explained that “we’ve elected to sacrifice something else as opposed to accuracy or accessibility. The sacrifice is speed – we’re rarely going to be the first organization to break news or to comment on a story.” Fair enough! RM prizes depth over quick turnaround too. (This may or may not be an attempt to offer a noble-sounding excuse for our frequent dry spells.) But it’s not clear that FiveThirtyEight is gaining much of anything by taking its time in rolling out its Senate model. Silver and his colleagues have certainly sacrificed speed, but the upshot is that they seem likely to get nothing in return.



