I don’t often tweet about really technical stuff anymore but I want to geek out a little bit here & dive into unpacking the specific causal question this paper is asking. Indulge me, if you will.
The authors use the synthetic control method & say they are estimating the average treatment effect of Trump rallies on COVID cases & deaths at the county level.

This is a nice example of economists using a method they are the experts in to answer an important COVID question.
My one major quibble with this paper is that I do not think the authors are actually estimating the average treatment effect, and if they are, I don’t think they should be.

Let me explain.
The average treatment effect could be written out in plain language as:

The number of COVID cases that would have been recorded if *all* counties had had Trump rallies compared to the number of COVID cases that would have been recorded if *no* counties had had Trump rallies.
This is a somewhat silly thing to estimate because there was never going to be a situation where *all* counties had Trump rallies.

Plus we don’t actually care what might have happened to COVID if counties that really *did not* have a rally had, contrary to fact, held a rally.
The average treatment effect in the treated is likely of much more interest in this scenario because it tells us specifically about what could have happened to the rally counties.
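A toy illustration of why the two estimands can give different answers. Every number and name below is made up, and in real data we never get to see both potential outcomes for the same county; this is only a sketch of the definitions. The average treatment effect in the treated is the same comparison as the average treatment effect, restricted to the counties that actually had rallies.

```python
# Toy potential outcomes for six hypothetical counties (all numbers invented).
# y0 = cases without a rally, y1 = cases with a rally,
# rally = whether the county actually had a rally.
counties = [
    {"name": "A", "y0": 100, "y1": 160, "rally": True},
    {"name": "B", "y0": 120, "y1": 170, "rally": True},
    {"name": "C", "y0": 80, "y1": 90, "rally": False},
    {"name": "D", "y0": 60, "y1": 65, "rally": False},
    {"name": "E", "y0": 90, "y1": 100, "rally": False},
    {"name": "F", "y0": 70, "y1": 75, "rally": False},
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def ate(rows):
    # All counties with a rally vs. all counties without one.
    return mean(r["y1"] - r["y0"] for r in rows)


def att(rows):
    # Same comparison, but only among counties that really had a rally.
    return mean(r["y1"] - r["y0"] for r in rows if r["rally"])


print("ATE:", round(ate(counties), 2))
print("ATT:", att(counties))
```

In this made-up example the rally counties are the ones where a rally matters most, so the effect in the treated is much larger than the effect averaged over every county.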
What is the average treatment effect in the treated? In simple language:

The number of COVID cases recorded in the counties which *had* Trump rallies compared to the number of COVID cases that *those counties* would have had if they had not had Trump rallies.
Written like this, it’s easier to see that the key assumption of the paper is:

the amount of COVID in counties that did have rallies would have been equal to the amount of COVID in counties that didn’t have rallies, if those first counties had, contrary to fact, not had rallies.
In practical terms, this means that the counties with no rallies that they picked to represent what might have happened to the counties with rallies had to be good matches for the rally counties in terms of all the things that determine COVID (except the rallies themselves).
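For anyone who prefers notation, here is one way to write this down. The symbols are my own shorthand, not taken from the paper: A = 1 means a county had a rally, Y is the observed COVID outcome, and Y^{a} is the outcome the county would have had under rally status a. The second equality for the ATT assumes that what we observe in rally counties is their with-rally outcome.

```latex
\begin{align*}
\text{ATE} &= E\left[Y^{a=1} - Y^{a=0}\right] \\
\text{ATT} &= E\left[Y^{a=1} - Y^{a=0} \mid A = 1\right]
            = E\left[Y \mid A = 1\right] - E\left[Y^{a=0} \mid A = 1\right] \\
\text{Key assumption:}\quad
  & E\left[Y^{a=0} \mid A = 1\right] = E\left[Y \mid A = 0,\ \text{matched}\right]
\end{align*}
```

The only piece of the ATT we cannot observe is the no-rally outcome for rally counties, and the matched no-rally counties are standing in for exactly that piece.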
This is a tough thing to be sure about, but one of the things I like about this paper is that they use a form of the negative control method to partially test this assumption. To do this, they look at the 10 weeks *before the rally* and redo the analysis over that time.
In theory, if the matched counties with no rallies are good proxies for the counties with rallies, they should get pretty similar estimates between what actually happened in the rally counties *before* the rallies & what the model from matched counties says *should have* happened.
This is exactly what they found and that’s a big part of why I think that, while this paper doesn’t model all the complexities of disease transmission, it does do a good job of answering an important causal question.
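Here is a rough sketch of what a pre-rally check like this can look like. It is not the authors' code or their exact procedure: the case counts are invented, there are only two comparison counties, the weights come from a simple grid search, and the "fake rally" at week 5 is my assumption for illustration. A real synthetic control analysis uses many comparison counties and matches on more than past case counts.

```python
# Hypothetical weekly case counts for the 10 weeks before a rally.
# All numbers are made up for illustration.
rally_county = [10, 12, 15, 19, 24, 31, 36, 45, 55, 63]
donor_a = [8, 10, 12, 16, 20, 24, 30, 36, 44, 52]
donor_b = [12, 14, 18, 22, 28, 36, 44, 54, 64, 76]

FAKE_RALLY_WEEK = 5  # pretend a rally happened here, when none did


def synthetic(weight, a, b):
    # Weighted blend of the two no-rally counties (weights sum to 1).
    return [weight * x + (1 - weight) * y for x, y in zip(a, b)]


def fit_weight(target, a, b, steps=100):
    # Grid search for the weight in [0, 1] that best tracks the target.
    def sse(w):
        return sum((t - s) ** 2 for t, s in zip(target, synthetic(w, a, b)))

    return min((i / steps for i in range(steps + 1)), key=sse)


# Fit using only the weeks before the fake rally.
weight = fit_weight(
    rally_county[:FAKE_RALLY_WEEK],
    donor_a[:FAKE_RALLY_WEEK],
    donor_b[:FAKE_RALLY_WEEK],
)

# Compare observed vs. synthetic in the weeks after the fake rally.
# No rally had happened yet, so these gaps should hover around zero.
predicted = synthetic(weight, donor_a, donor_b)[FAKE_RALLY_WEEK:]
placebo_gaps = [obs - syn for obs, syn in zip(rally_county[FAKE_RALLY_WEEK:], predicted)]

print("weight on donor_a:", weight)
print("placebo gaps:", placebo_gaps)
```

If the gaps in a check like this were large, that would be a sign the comparison counties are not good stand-ins for the rally counties, and the post-rally estimate would be hard to trust.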
Bottom line:

How many fewer COVID cases & deaths would counties that had Trump campaign rallies have had if they hadn’t actually held those rallies?

The answer, according to this paper, is a LOT.
You can follow @EpiEllie.