THREAD: 538’s 2020 election forecast design tweaks solve an important problem from 2016, but 538 has not grappled w larger issues—how they shape the social consensus about the presidential race + what that does to voters and political actors. https://twitter.com/wiederkehra/status/1287872080474603521?s=21
Broad social consensus? Really? Here’s a chart showing the rising prominence of 538’s influence in the news cycle in the 6 months leading up to presidential elections:
Here’s a plot showing how 538’s 2018 real-time House forecast caused a temporary spike in the U.S. bond yield. Yeah, 538’s forecast made it more expensive for the world’s superpower to borrow money, at least for a short time.
538's forecasts are rigorous. Unlike conventional horserace coverage of swings, unusual results, & speculation about paths to victory, they distill everything into a single compact number. That number accounts for both the vote margin and the uncertainty in aggregated polling data.
But they acknowledge that their 2016 presentation—a bar conveying that Clinton had a ~72% chance of beating Trump—was a recipe for criticism & potential miscommunication.
Most critically, some people confused the probability of winning with the vote share. Talking Points Memo made this very mistake: https://talkingpointsmemo.com/news/issa-calls-race-early
And @ylelkes, @seanjwestwood & I have seen this play out in our research: around 40% of people confused forecasted win probability & vote share, putting the same number down for both after viewing a probabilistic forecast. https://solomonmg.github.io/pdf/aggregator.pdf
And @StatModeling referenced the fact that a small change in the vote share often corresponds to a massive shift in the probability of victory.
https://andrewgelman.com/2012/10/22/is-it-meaningful-to-talk-about-a-probability-of-65-7-that-obama-will-win-the-election/
By moving to an odds-based metric, they’ve done a lot to address this issue, potentially at the expense of user engagement: most people DON’T WANT to think carefully about the chances a candidate wins.
If I see 7:8 odds today and 9:16 odds tomorrow, I have to either think carefully about what that means or do a bunch of math to get the two odds over the same denominator.
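The math here isn’t hard, it’s just friction. A quick sketch of the conversion (my own illustration, not 538’s code; the function name is made up):

```python
def odds_to_prob(chances_for, chances_against):
    """Convert 'for:against' odds into an implied win probability."""
    return chances_for / (chances_for + chances_against)

# 7:8 odds today vs. 9:16 odds tomorrow: hard to compare by eye,
# easy once both sit on the same 0-to-1 probability scale.
today = odds_to_prob(7, 8)      # 7/15, about 0.467
tomorrow = odds_to_prob(9, 16)  # 9/25, exactly 0.360
```

That’s the work a reader has to do in their head every time the odds format changes under them.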
And that denominator matters a lot. If Biden has a 90% chance of winning, Trump has a 1 in 10 chance. If Biden has a 95% chance of winning, Trump has a 1 in 20 chance! Trump’s chances are twice as high in the first case, a massive difference that our brains hide from us because we like to approximate things.
We see probabilities subjectively. And so we play the lottery and live in fear of terrorist attacks but love driving in cars and generally reason badly about the payoffs of low-probability events.
They also provide a better way to convey uncertainty, using “ballswarms” and histograms. That’s good, and the only suggestion I’d add is to express chances as x in 1,000 to make them more comparable.
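Putting every chance over the same denominator of 1,000 is trivial to do (a hypothetical helper, just to illustrate the suggestion):

```python
def in_1000(prob):
    """Express a win probability as an 'x in 1,000' chance."""
    return round(prob * 1000)

# Trump at 10% vs. 5%: the doubling is obvious once both
# chances share the same denominator.
print(f"{in_1000(0.10)} in 1,000")  # 100 in 1,000
print(f"{in_1000(0.05)} in 1,000")  # 50 in 1,000
```

Fixed-denominator chances let readers compare day over day without converting anything.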
But there are much larger questions here, most of which relate to something about which @NateSilver often expresses frustration—how the media interpret and report on his forecast. https://fivethirtyeight.com/features/the-media-has-a-probability-problem/
How will the media & public interpret these forecasts in 2020? Will they engage more deeply w/ the numbers? Will they see the race as a foregone conclusion, or has 2016 inoculated society against the media narrative?
Just the other day, a staffer from the Cook Political Report (THEY DO FORECASTING) had trouble putting his finger on just what he thought was wrong w/ @gelliottmorris’s forecast https://twitter.com/redistrict/status/1287384315337347074?s=21
@gelliottmorris does MORE THAN ANYONE to make it easy to replicate what he’s done. I’ve run the code. He’s given the public everything except the ability to run his very latest 2020 predictions (and he was willing to work something out w/ me in the interest of good science).
AND @gelliottmorris’s forecast strikes me as one of the most rigorous I’ve seen. It relies more on highly aggregated voting returns and economic data, which are more stable than polling alone.
He’s worked with @StatModeling on the model & uses a smart methodological approach (regularization, optimizing for test error, back-testing, Bayesian estimates, all the things).
There ARE legit reasons to wonder if @gelliottmorris’s forecast is too certain. Potential overfitting given the small # of data points. Covid/VBM/suppression introducing unusual biases and uncertainty that we haven’t had in the past & that the model doesn’t (YET) account for.
But not many people understand that! Moreover, most people DON’T WANT to understand the guts of what goes into the bottom line, they just want the bottom line.
How can we expect ordinary people or policymakers or even, apparently, data journalists to make sense of these models?
The social implications are hard to grapple with, because policymakers are making endogenous decisions based on election forecasts. Exhibit A is the Comey letter. @NateSilver and others have suggested that the letter probably cost Clinton the election. https://fivethirtyeight.com/features/the-comey-letter-probably-cost-clinton-the-election/
But Comey is clear in his book: he only released the letter because he was sure Clinton was going to win and didn’t want to undermine her legitimacy, presumably because of the media narrative, which was driven in large part by election forecasts https://abcnews.go.com/Politics/comey-included-thought-clinton-win-2016-election/story?id=54486869