I picked this because: (1) it's very recent; (2) the question is very important; (3) masks have, IMO, risen to an unsupported status as a fix-all through magical thinking. We didn't do everything else we needed to, so we're clinging to a weak treatment as a cultural symbol to feel good.
(4) From the abstract, this looks like a time-series cross-section analysis of COVID cases, which is in my wheelhouse, so I feel comfortable talking about how hard it is to do well and how badly we've been doing it recently.
The main takeaway of that paper, and so my starting prior, is a reduction in adjusted risk from about 17.4% to 3.1%, with big uncertainty. In a clinical setting, with well-fitted, well-worn masks, it might cut risk substantially, from really bad to just bad. So do it, but don't count on it.
Anecdotally, from friends in an infectious disease department: N95s protect you; lesser masks protect everyone if everyone is wearing them; and lesser masks might actually increase your risk if you're the only one wearing one, because they collect others' sprays. So, weak priors.
The universe is only 3 time series: New York, all of Italy, and just Wuhan, China.
The unit of analysis is the location-day
DV_1 is confirmed cases and deaths
From the Wuhan Municipal Health Commission, NYC.gov, and the European CDC.
Their argument for only considering 3 cases is that they're the epicenters, going from Wuhan to Italy to New York. That's already not a good start. You want variation in treatment and outcomes so you can establish relationships. This is selecting on the dependent variable.
IV_1 Face Covering
Date of the government mandate for face coverings: April 6 in Northern Italy, April 17 in NYC. And supposedly every kind of intervention happened simultaneously at some unknown time in January in Wuhan.
IV_2 Social Distancing
Government mandate. For Wuhan, again sometime in January, along with everything else.
March 9 for Italy.
Multiple orders in NYC: March 16, March 19, April, March 22?
They're confusing about the mapping of government orders to treatments to dates.
Their plots don't clear things up either. What?
For starters, let's talk about measurement. Testing increased throughout these periods, so the number of confirmed cases isn't a good proxy for actual cases. A lot of that exponential increase at the beginning is just finding cases that already existed.
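To see how much testing expansion alone can distort the picture, here's a toy sketch (every number below is invented): hold true infections perfectly flat while testing capacity ramps up 5% a day, and "confirmed cases" grow exponentially anyway.

```python
import numpy as np

days = np.arange(60)
true_new_infections = np.full(60, 1000.0)   # flat epidemic (assumption)
tests_per_day = 200 * np.exp(0.05 * days)   # testing capacity ramps up 5%/day
# Fraction of true infections detected, capped at 100%
detection_rate = np.minimum(tests_per_day / 5000.0, 1.0)
confirmed = true_new_infections * detection_rate

# Confirmed cases "grow" ~5%/day even though true infections never moved
growth = np.diff(np.log(confirmed))
print(growth.mean())  # ≈ 0.05
```

The apparent growth rate here is purely the growth rate of testing, which is exactly the trap a before/after slope comparison on confirmed cases falls into.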
https://ourworldindata.org/coronavirus-data-explorer?zoomToSelection=true&testsMetric=true&dailyFreq=true&aligned=true&smoothing=7&country=USA~GBR~CAN~BRA~AUS~IND~DEU~FRA~MEX~CHL~ZAF~DZA~COL&pickerMetric=location&pickerSort=asc
Next, we have actual measurements of mobility, thanks to Google, Apple, SafeGraph, etc. We should never be checking an intent-to-treat variable like a government order or protests without actually seeing what its real-world effect was downstream.
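As a sketch of what that first-stage check looks like, with invented mobility numbers standing in for a real Google/Apple/SafeGraph pull and an illustrative order date:

```python
import numpy as np
import pandas as pd

# Hypothetical daily mobility index (100 = baseline). In practice you'd pull
# the actual Google/Apple/SafeGraph series for the location in question.
idx = np.arange(40)
mobility = pd.Series(
    np.where(idx < 10, 100.0,                     # pre-news baseline
    np.where(idx < 16, 100.0 - 8.0 * (idx - 9),   # voluntary decline
             45.0)),                              # plateau
    index=pd.date_range("2020-03-01", periods=40))

mandate = pd.Timestamp("2020-03-16")  # illustrative order date
pre = mobility[mandate - pd.Timedelta(days=7):mandate - pd.Timedelta(days=1)]
post = mobility[mandate:mandate + pd.Timedelta(days=6)]
print(f"week before order: {pre.mean():.0f}, week after: {post.mean():.0f}")
# Most of the drop happened *before* the order: the mandate date is a poor
# proxy for when behavior actually changed.
```

If the mobility series moves days before the order date (as it did in reality), then coding the treatment at the order date mis-times the intervention by construction.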
Next, we should think through identification here. There is no identification strategy in the paper; they literally just check the slope of COVID growth before and after an intervention and call it a day.
The introduction of that intervention was not the only thing that changed between those two periods. COVID news itself drives people's behavior, and a lot of people reduced mobility prior to the introduction of a government restriction. They also increased it prior to the lift.
There's also a negative feedback loop between how many cases there are and how scared people are to go outside. Growth can be self-limiting, just naturally: it spikes, scares people, slows down again, people relax, it spikes again, and so on, ratcheting up over and over.
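That self-limiting dynamic is easy to produce in a toy SIR model where the contact rate falls as current infections rise. Every parameter below is an invented illustration, not an estimate.

```python
import numpy as np

def simulate(fear, days=300, N=1_000_000):
    """Discrete-day SIR; contact rate shrinks as current infections rise."""
    S, I = N - 100.0, 100.0
    beta0, gamma = 0.35, 0.15
    daily = []
    for _ in range(days):
        beta = beta0 / (1.0 + fear * I)   # behavioral feedback term
        new_inf = beta * S * I / N
        S, I = S - new_inf, I + new_inf - gamma * I
        daily.append(new_inf)
    return np.array(daily)

no_feedback = simulate(fear=0.0)   # classic epidemic: one big peak
feedback = simulate(fear=5e-4)     # growth slows as soon as people get scared
print(no_feedback.max(), feedback.max())
# With feedback, daily cases flatten out on their own, with no mandate in the
# model at all. A before/after slope comparison would "find" an effect anyway.
```

The point isn't the particular numbers; it's that a slowdown in growth is exactly what this feedback produces with zero policy interventions.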
So, spoiler alert: no matter what they find, this research design would not provide strong evidence that should affect our beliefs either way. This kind of fitting-linear-trend-lines exercise can't possibly tell you the thing you want to know.
Even if we had identification, people vastly overestimate what can be learned from highly autocorrelated time-series data. We can demonstrate that with a small placebo test. Here's my pull of confirmed cases for NYC by day, from the Bing repo.
If we imagine a placebo treatment at each date along this line, and compare the slope of confirmed cases before and after that point, this is the estimated "treatment" effect we find: first things that make COVID worse, then things that make COVID better.
If we were counting on statistical significance to save us, it wouldn't. 9 (18%) of our placebo treatment dates are statistically significant at 0.01, 13 (25%) at 0.05, and nearly half, 22 (44%), at 0.1.
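That placebo exercise is easy to reproduce. Here's a sketch on synthetic data (a smooth logistic epidemic with no interventions at all, standing in for the real NYC pull), fitting a before/after slope change in log daily cases at every candidate date:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic epidemic with NO interventions: logistic cumulative cases, so
# growth slows down entirely on its own. (Invented stand-in for the NYC pull.)
t = np.arange(90)
daily = np.diff(200_000 / (1 + np.exp(-0.12 * (t - 45))))
log_daily = np.log(daily) + rng.normal(0, 0.05, size=daily.size)  # noise

def placebo_pvalue(break_day):
    """p-value on a slope-change term at break_day in an OLS fit of
    log(daily cases) on day."""
    d = np.arange(log_daily.size, dtype=float)
    post = (d >= break_day).astype(float)
    X = np.column_stack([np.ones_like(d), d, (d - break_day) * post])
    beta, *_ = np.linalg.lstsq(X, log_daily, rcond=None)
    resid = log_daily - X @ beta
    dof = log_daily.size - X.shape[1]
    se = np.sqrt(resid @ resid / dof * np.linalg.inv(X.T @ X).diagonal())
    return 2 * stats.t.sf(abs(beta[2] / se[2]), dof)

pvals = np.array([placebo_pvalue(b) for b in range(10, 80)])
print(f"{np.mean(pvals < 0.05):.0%} of placebo dates are 'significant' at 0.05")
```

Nothing happened on any of these dates, yet the slope-change term keeps coming up significant, because the series is smooth and autocorrelated, not because any treatment worked.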
I think I'll stop there. This is incredibly weak evidence. It doesn't tell us that masks work and social distancing doesn't, and it shouldn't present itself as a strong test of those ideas.
Since it's such mediocre work, it definitely shouldn't conclude by dunking on policymakers: "sound science ... should constitute the prime foundation in decision-making." Anyone who cares about sending strong, reliable expert signals needs to earn it by doing careful work.
I've discovered that others reached the same conclusion about the quality of this paper and are asking for a full retraction https://twitter.com/DanLarremore/status/1273657439305383936
And here's an animated sensitivity analysis showing how your conclusions could flip if you vary the badly coded treatment date by just a few days either way https://roadtolarissa.com/regression-discontinuity/
Wow. Nice comparison between experts who actually read the paper and are pointing out how bad it is, and a set of experts who, I'll generously say, didn't read the paper but were just super excited to declare its findings true to influence people:
https://www.sciencemediacentre.org/expert-reaction-to-a-study-looking-at-mandatory-face-masks-and-number-of-covid-19-infections-in-new-york-wuhan-and-italy/
You can follow @RexDouglass.