Calls for open science are a common response to concerns about the replication crisis. But can open science actually address the causes of the crisis? The comic under discussion: https://twitter.com/mslapointe/status/1355944863959838722
The comic presents 9 factors as contributing to the replication crisis.
It also presents 5 responses: open data, open code, preregistration, policy changes by universities and funders, and pressure from senior scientists on institutions to make these changes.
Let's look at the 9 factors. For each one, is open data, open code, and/or preregistration likely to mitigate it?
1. "[Scientists] cut corners": This first one's ambiguous. (A) Maybe they mean typos or other bugs in the analysis code? Open *code* would help with this, though open *data* and preregistration won't. +
(B) It might also include data mismanagement, where bad practices corrupt the data. Open data could help here, letting independent researchers inspect the data for signs indicating corruption. +
Let's say both, so that both open data and open code count towards this one.
2. "Rush their studies": Hard to see how open code and preregistration would check whether authors have read the papers they cite +
3. "Overlook inconvenient results": Open data would help independent researchers discover unreported filtering/trimming; open code would do this more efficiently. Preregistration won't help here. +
4. "Massage their statistics": Also called p-hacking. Preregistration would help here. Open data wouldn't. Open code might, but only if p-hacking researchers kept in all of the unreported analyses. +
And why would you do that if you knew the code was going to be published? +
5. "Hype up the importance of their findings": Nope, open science and preregistration won't help here +
6. Fraud: Open data might make it easier to notice that something is weird in the data, which might be due to either fraud or mismanagement. (IMO mismanagement is probably much more common than fraud.)
But a clever fraudster would simulate data, the same way methodologists do when checking whether methods have the statistical properties we think they do. That's far easier than conducting pretty much any empirical study; see the sketch after this list.
So, pretty quickly, I think open data would stop being useful for detecting fraud.
I don't see how preregistration would help with fraud detection either: you can preregister an analysis that will be run on fraudulent data.
7. Self-citation and 9. Citation rings: Nope. Open data, open code, and preregistration have nothing to do with the citations in a paper.
8. "Plagiarising their own papers to publish them again in different journals": Again, open data, open code, and preregistration won't help.
Turning these around, let's see which factors are covered by each remedy:
Open data: 2/9, plus 1 temporarily: #1 (data mismanagement) and #3 (unreported filtering), plus #6 (fraud) until fraudsters adapt.
Open code: 2/9: #1 (coding errors) and #3 (unreported filtering).
Preregistration: 1/9: #4 (p-hacking).
Putting all of these together, these practices would address 3 of the 9 factors, plus 1 temporarily.
So they're likely to be helpful. But not a panacea.
And these methods have notable downsides.
Doing open data and open code well, so that other people can actually rerun your code and reanalyze your data, is hard and time-consuming.
In fields like epidemiology, where researchers work with sensitive data like medical records, it would be unethical (and illegal) to publish the data.
Preregistration is only really appropriate for experimental science, and a lot of scientific fields aren't based on experiments.
Better responses to the replication crisis, I think, would address the publish-or-perish incentive structure that's behind all 9 factors.