A 10-tweet thread about the argument and solutions in this paper (thread emoji).

As with most Twitter threads, this one lacks the subtlety of the argument in the paper. It also has significantly more GIFs than the paper (no GIFs allowed in preprints so far!).

1/10 https://twitter.com/EdArXiv/status/1304470287438639106
You may remember when universities and schools shut down back in March 2020, and lots of research projects had to suddenly stop collecting data.

Schools in every US state closed for what ended up being the rest of the school year.
If you want to estimate how effective an intervention is in a school, but you don’t have a post-test, you are really stuck.

That felt pretty devastating to my education friends.

(GIF: David Tennant as 10 standing in the rain, crying about his missing post-test data)
In this paper, I argue that if you are working with multiple cohorts of students, you can treat the COVID missingness just like any other missing data.

How?

Three things to consider:

4/
First!

Every student in your COVID cohort is missing their post-test. However, you do have (or will have) post-test data from other cohorts.

So the post-test data from _some proportion_ of your study are missing.

5/
But why are they missing? You didn't lose participants because kids moved out of the school district. Teachers didn't quit because your treatment was hard.

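COVID shut down everything, so everyone in that cohort is missing the post-test, and the reason has nothing to do with your treatment or with how students would have scored. By definition, the data are Missing at Random.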

(Missing at Ransom?)
Second!

The What Works Clearinghouse doesn't have guidelines about pandemics (yet). They do have guidelines about how much data can be missing while still providing strong evidence of effectiveness (Table II.2).

How much can be missing? Up to 50%!

https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_standards_handbook_v4.pdf
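
A quick way to see where a study stands is to compute the share of post-tests that are missing. This is only a sketch: the three cohorts of 60 students and the cohort labels are made up, and the 0.50 ceiling is simply the figure quoted in this thread, so check the handbook table for the boundary that applies to your design.

```python
def missing_share(records):
    """Fraction of records whose post-test is None."""
    missing = sum(1 for r in records if r["post"] is None)
    return missing / len(records)


# Hypothetical study: three cohorts of 60 students, the middle one lost to COVID.
records = [
    {"cohort": cohort, "post": None if cohort == "2019-20" else 70.0}
    for cohort in ("2018-19", "2019-20", "2020-21")
    for _ in range(60)
]

THREAD_CEILING = 0.50  # assumption: the "up to 50%" figure quoted in the thread

share = missing_share(records)
within_ceiling = share <= THREAD_CEILING
print(f"{share:.1%} of post-tests missing")  # 33.3% of post-tests missing
print("within the quoted ceiling:", within_ceiling)
```

With one of three equally sized cohorts affected, a third of the post-tests are missing, which is below the quoted ceiling.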
Third!

Does it maybe still feel weird that you have a whole cohort of kids missing an outcome? Can you really include them?

Guess what! There are research designs called "planned wave missingness designs" that drop a data collection point _ON PURPOSE_.

8/
Conclusion: If you have a cohort design, you can use modern missing data methods to impute the post-test data for this COVID cohort in order to get an estimate of your treatment effect.
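
To make that concrete, here is a minimal sketch of the idea, not the paper's actual analysis. It assumes simulated data (three cohorts, a true effect of 4 points, all made up), fills in the COVID cohort's post-tests from a pre-test regression fitted within each arm on the other cohorts, repeats that 20 times, and pools the results with Rubin's rules. It is deliberately simplified: the regression parameters are not redrawn for each imputation and classroom clustering is ignored, so a real analysis should use an established multiple-imputation package.

```python
import random
import statistics as st

rng = random.Random(2020)  # fixed seed so the sketch is reproducible

# Simulated (made-up) study: three cohorts, post-tests lost for "2019-20".
rows = []
for cohort in ("2018-19", "2019-20", "2020-21"):
    for i in range(60):
        pre = rng.gauss(50, 10)
        treat = i % 2
        post = 5 + 0.9 * pre + 4 * treat + rng.gauss(0, 5)
        if cohort == "2019-20":
            post = None  # the whole cohort is missing its post-test
        rows.append({"cohort": cohort, "treat": treat, "pre": pre, "post": post})


def fit_line(x, y):
    """Least-squares intercept, slope, and residual SD of y on x."""
    mx, my = st.fmean(x), st.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sd = (sum(r * r for r in resid) / (len(x) - 2)) ** 0.5
    return intercept, slope, sd


def impute_once(data, rng):
    """Return a completed copy of data with missing post-tests filled in."""
    completed = []
    for arm in (0, 1):
        arm_rows = [r for r in data if r["treat"] == arm]
        seen = [r for r in arm_rows if r["post"] is not None]
        a, b, sd = fit_line([r["pre"] for r in seen], [r["post"] for r in seen])
        for r in arm_rows:
            post = r["post"]
            if post is None:
                # Prediction plus random noise, so imputed values keep their spread.
                post = a + b * r["pre"] + rng.gauss(0, sd)
            completed.append({**r, "post": post})
    return completed


def effect_and_variance(data):
    """Difference in mean post-test (treated minus control) and its variance."""
    t = [r["post"] for r in data if r["treat"] == 1]
    c = [r["post"] for r in data if r["treat"] == 0]
    return st.fmean(t) - st.fmean(c), st.variance(t) / len(t) + st.variance(c) / len(c)


def pool(results):
    """Rubin's rules: pooled estimate and standard error across imputations."""
    m = len(results)
    estimates = [e for e, _ in results]
    within = st.fmean([v for _, v in results])
    between = st.variance(estimates)
    return st.fmean(estimates), (within + (1 + 1 / m) * between) ** 0.5


results = [effect_and_variance(impute_once(rows, rng)) for _ in range(20)]
estimate, std_error = pool(results)
print(f"pooled effect = {estimate:.2f}, SE = {std_error:.2f}")
```

The pooled standard error includes the between-imputation variance, which is how the extra uncertainty from the missing cohort shows up in the final estimate.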

That's my 10-tweet overview. Have a read, and analyze some data!
You can follow @jarlogan.