Today: "wow, the polls were pretty far off, maybe we should have taken them with a grain of salt."
Next Tuesday: "Here's Nate Entrails with a deep dive analysis of the latest tracking numbers from the NBS-Squippippinac Approve-O-Meter"
I guess it's time for my quadrennial rant on sampling statistics.
Sampling statistics have a long and proud history in all kinds of applications, from medical studies to manufacturing to agriculture, and yes, for a while, in politics.
For example, suppose you had a widget factory and wanted to know what percent of the widgets coming off the line are defective. Let's say testing for defects was very expensive, or involved destroying the widget. Obviously you can't test every single widget.
The answer is random sampling, which gives you both an estimate of the % of defects, and a "margin of error" (confidence interval) that decreases asymptotically with the size of your sample. Mathemagically, you don't have to have a very big sample to get a very tight estimate.
Now let's say you do a simple random sample of n=1000 widgets coming off the line, and find that 100 are defective (p=0.10). The standard error of your estimate is
sqrt(p*(1-p) / n) = sqrt(0.9*0.1 / 1000) = 0.0095
The 95% confidence interval ("margin of error") on your estimate is +/- two standard errors (actually 1.96):
2*0.0095 = 0.019
Roughly, there's a 95% chance that your *real* widget defect rate is 10%, plus or minus 1.9%.
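Here's the same widget arithmetic as a quick Python sketch. The function names are mine, made up for illustration; the only math in it is the formula above, using 1.96 rather than the rounded 2.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion p estimated from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def margin_of_error(p, n, z=1.96):
    """Half-width of the confidence interval; z=1.96 corresponds to 95%."""
    return z * standard_error(p, n)

n = 1000          # widgets sampled
defective = 100   # widgets that failed the test
p = defective / n

se = standard_error(p, n)
moe = margin_of_error(p, n)

# prints: p = 0.10, SE = 0.0095, margin of error = +/- 0.019
print(f"p = {p:.2f}, SE = {se:.4f}, margin of error = +/- {moe:.3f}")
```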
If you want to get really precise and cut the margin of error in half, you'd have to quadruple your sample size; margin of error is inversely proportional to the square root of n.
This is basically the math used in political polling when you hear the term "margin of error." Most political polls are n=~1000, with a candidate's % around 50%, which if you plug it into the formula above corresponds to a margin of error of about 3%.
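To see both the roughly-3% figure and the quadruple-the-sample rule, here's a small sketch that plugs a 50% candidate into the same formula at a few sample sizes. The sample sizes beyond 1000 are just illustrative picks, and the helper name is my own.

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

p = 0.5  # candidate polling around 50%

# Each step quadruples n, so each margin should be half the previous one.
margins = {n: margin_of_error(p, n) for n in (1000, 4000, 16000)}

for n, moe in margins.items():
    print(f"n = {n:>6}: margin of error = +/- {100 * moe:.2f} points")

ratio = margins[1000] / margins[4000]
print(f"margin at n=1000 divided by margin at n=4000: {ratio:.2f}")
```

At n=1000 this comes out to about 3.1 points, and going from 1000 to 4000 respondents cuts it in half.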
But here's the deal: voters are not *widgets*. You can't randomly pick them off the assembly line and take them to the test bench. These kinda widgets have caller ID, may not be interested in talking to you if you get past that, and even then may be totally fucking with you.
Political pollsters are loath to report how many calls they have to make to harvest one usable poll response, but from what I've discerned it's somewhere between 7 and 15, which naturally leads one to question how weird these widgets who volunteered to be tested are.
This wasn't much of an issue in 1968 or 1976, when people universally had a big avocado landline phone in their kitchen, and would always answer even if they had no idea who was calling.
I mean, pollsters are kind of aware of these issues and know that their raw sample results are demographically skewed. For instance, the sample may be 3% Hispanic males 18-45, where the census says they are 5% of the population; the answer is to shrug and weight them up by 5/3.
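A minimal sketch of that kind of demographic weighting, for the curious. The 3% and 5% shares are the ones from the example; the two-group split and the support numbers are invented purely for illustration, and real pollsters weight on many more variables than this.

```python
# Hypothetical two-group breakdown. Only the 3% / 5% shares come from the
# example above; the "support" rates are made-up illustration values.
groups = {
    "hispanic_men_18_45": {"sample_share": 0.03, "population_share": 0.05, "support": 0.60},
    "everyone_else":      {"sample_share": 0.97, "population_share": 0.95, "support": 0.48},
}

# Weight = how much each respondent in the group counts, so that the weighted
# sample matches the census shares.
weights = {
    name: g["population_share"] / g["sample_share"]
    for name, g in groups.items()
}

# Raw estimate: every respondent counts once.
raw_estimate = sum(g["sample_share"] * g["support"] for g in groups.values())

# Weighted estimate: each group's respondents are scaled by the group's weight.
weighted_estimate = sum(
    g["sample_share"] * weights[name] * g["support"]
    for name, g in groups.items()
)

print(f"weight for Hispanic men 18-45: {weights['hispanic_men_18_45']:.3f}")
print(f"raw estimate: {raw_estimate:.4f}, weighted estimate: {weighted_estimate:.4f}")
```

Note what this does and doesn't do: it fixes the mix of groups, but it still assumes the people who answered within each group think like the ones who didn't.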
It's one thing to reweight your sample so it's roughly demographically similar to the underlying population. It's another thing to reweight your sample to be representative of the vast majority of people who *don't want to be sampled*.
Please note I am not ascribing any nefarious motives or biases to pollsters. Even if they were all 100% committed to accuracy, there's no easy way around this issue.
I will say that reporting a "margin of error" as if the 10-15% of chatty weirdos who wanted to talk to you are representative of the general public is, well, mathematical malpractice.
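Here's a toy calculation of why that matters. Every number in it is an assumption I made up: a race that is truly 50/50, where one candidate's supporters pick up the phone 12% of the time and everyone else 8% of the time. It is a sketch of the mechanism, not a model of any real poll.

```python
import math

def respondent_share(true_share, rate_supporters, rate_others):
    """Share of respondents backing the candidate when the two camps answer at different rates."""
    responding_supporters = true_share * rate_supporters
    responding_others = (1 - true_share) * rate_others
    return responding_supporters / (responding_supporters + responding_others)

true_share = 0.50        # hypothetical: the race is actually tied
rate_supporters = 0.12   # hypothetical response rate among the candidate's supporters
rate_others = 0.08       # hypothetical response rate among everyone else

observed = respondent_share(true_share, rate_supporters, rate_others)

# The margin of error the poll would report, treating respondents as a simple random sample.
n = 1000
reported_moe = 1.96 * math.sqrt(observed * (1 - observed) / n)

bias = observed - true_share

print(f"poll says {100 * observed:.1f}% +/- {100 * reported_moe:.1f} points")
print(f"actual support is {100 * true_share:.1f}%, so the poll is off by {100 * bias:.1f} points")
```

With these made-up rates the poll reports 60% with a margin of error of about 3 points, while the truth sits 10 points away. A bigger sample shrinks the reported margin and does nothing to the bias.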
It's not that the political polls are *wrong* per se, it's that they no longer contain any useful information at all.
I'm sure there will be a few polls now touted as "nailing" the election; keep in mind if you assembled election predictions from 100 astrologers and phrenologists and circus seals, you'd probably get a few amazingly accurate guesses too.
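You can put rough numbers on the circus seals. Assume, hypothetically, that each of the 100 guessers picks a vote share uniformly at random between 40% and 60%, that the real result is at least a point inside that range, and that "nailing it" means landing within 1 point of the result. All of those are my assumptions for the sake of the sketch.

```python
# Hypothetical setup: uniform random guesses over a 20-point range,
# and a "hit" is any guess within +/- 1 point of the true result.
guessers = 100
guess_range = 20.0   # guesses span 40% to 60%
hit_window = 2.0     # +/- 1 point around the actual result

# Chance that a single know-nothing guess lands in the window.
p_hit = hit_window / guess_range

# Expected number of "amazingly accurate" guessers out of 100.
expected_hits = guessers * p_hit

# Chance that at least one of them gets to brag afterwards.
p_at_least_one = 1 - (1 - p_hit) ** guessers

print(f"chance a single random guess 'nails it': {p_hit:.0%}")
print(f"expected number of guessers who 'nail it': {expected_hits:.0f}")
print(f"chance at least one does: {p_at_least_one:.5f}")
```

Under those assumptions you'd expect about 10 of the 100 to look like geniuses, with zero information involved.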
In short, until there is some way for the political polling biz to prove it has solved its inherent here's-what-the-weirdos-say problem, the only place polls should be reported is on Page 15D, between the Daily Horoscope and Garfield.
Just to be safe though: if you accidentally answer a call from a political pollster, remember it is your sacred duty as an American citizen to lie your ass off.