Thread: Two sets of preregistered multisite replications of the same original study can disagree with each other. A puzzling subplot of Many Labs 5 (ML5) that suggests some truth to “hidden moderator” accounts.

https://osf.io/sy4b9/ 
1.) In 2012, @mcfrank and I led a replication for the Reproducibility Project: Psychology (RPP) that failed. Original study (Risen & Gilovich): people think “tempting fate” (e.g., skipping the assigned reading for class) causes karmic bad outcomes (the professor calls on you to answer questions about the reading you shirked).
2.) Original authors quite reasonably pointed out that using this particular classroom scenario might not work as well on MTurk (as we did for RPP) as it did for them, at Cornell. What if MTurkers are distracted or just don’t care about being cold-called in class?
3.) For ML5, we (incl. @CharlieEbersole, @BalazsAczel, @mh_bernstein, @bradywiggins, @mcxfrank, @gidin) did improved replications: in person, with undergrads at universities, some similar to Cornell. Once again, effect clearly failed to replicate. Familiar story, but…
4.) …PLOT TWIST!! Unbeknownst to us, another Many Labs project (ML2) by @raklein3 simultaneously replicated the *same* original study; protocol closely resembled our improved one. For them, it *worked*! The effect replicated, though with a smaller effect size than original.
5.) I dug hard into the materials and analysis differences between ours and ML2’s. @raklein3 was great. I reanalyzed and pooled subject-level data using identical methods. I compared our respective MTurk samples. I subsetted sites to make highly comparable sampling frames.
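For concreteness, here is a minimal sketch of what pooling subject-level data across sites under identical methods can look like. Everything here is hypothetical (simulated data, invented column names, a made-up outcome scale); the actual ML5 and ML2 analysis code lives in the OSF materials linked above.

```python
# Minimal sketch: pool subject-level replication data across sites,
# then estimate one condition effect with site fixed effects.
# All numbers and variable names are hypothetical, not ML5/ML2 data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

def fake_site(site, n, true_effect):
    """Simulate one site: binary tempting-fate condition and a
    continuous perceived-likelihood outcome (hypothetical scale)."""
    condition = rng.integers(0, 2, n)  # 0 = control, 1 = tempt fate
    likelihood = 5 + true_effect * condition + rng.normal(0, 2, n)
    return pd.DataFrame({"site": site, "condition": condition,
                         "likelihood": likelihood})

# Concatenate subject-level rows from every site into one data frame,
# so the condition effect is estimated from pooled data under
# identical methods rather than site-by-site analysis choices.
pooled = pd.concat([fake_site(f"site_{i}", 120, 0.3) for i in range(6)],
                   ignore_index=True)
fit = smf.ols("likelihood ~ condition + C(site)", data=pooled).fit()
print(f"pooled condition effect = {fit.params['condition']:.3f} "
      f"(SE = {fit.bse['condition']:.3f})")
```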
6.) Protocol differences were minimal and I could not adjudicate the discrepancies. Table of mysteries:
7.) That two preregistered, high-powered multisite replications of the same original study can disagree seems to me extremely important. Barring data collection errors that affected all our sites but none of ML2’s, it’s hard to find explanations other than…hidden moderators.
8.) Take-home #1: Yes, the overall ML5 results suggest that “known” moderators (e.g., those suspected by original authors) don’t usually account for RPP failures. This is clearly important. But it does *not* follow that hidden moderators don’t exist. In fact, they probably do.
9.) Take-home #2: The tempting fate effect is probably “real” in some situations, but we don’t know what those situations are. Conceptual rather than direct replications would help. We replicators must increase our attention to both measured moderators and residual heterogeneity.
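To make “residual heterogeneity” concrete: under a standard random-effects model, the between-site variance tau^2 captures how much true effects vary across settings even after holding the protocol fixed. A minimal sketch with placeholder numbers (not ML5 or ML2 estimates), using the usual DerSimonian-Laird estimator:

```python
# Minimal sketch: estimate residual heterogeneity (tau^2) across sites
# and what it implies about effects varying by setting.
# Effect estimates and variances below are made-up placeholders.
import numpy as np
from scipy.stats import norm

y = np.array([0.45, 0.02, -0.15, 0.38, 0.10])  # site effect estimates
v = np.array([0.02, 0.03, 0.02, 0.04, 0.03])   # their sampling variances

# DerSimonian-Laird estimate of tau^2, the between-site variance of
# the true effects (the "residual heterogeneity").
w = 1 / v
mu_fe = np.sum(w * y) / np.sum(w)
Q = np.sum(w * (y - mu_fe) ** 2)
tau2 = max(0.0, (Q - (len(y) - 1)) /
           (np.sum(w) - np.sum(w ** 2) / np.sum(w)))

# Random-effects pooled mean using tau^2-adjusted weights.
w_re = 1 / (v + tau2)
mu_re = np.sum(w_re * y) / np.sum(w_re)

# If true effects are roughly normal across settings (and tau > 0, as
# it is for these numbers), the share of settings where the effect
# exceeds a threshold q is 1 - Phi((q - mu) / tau). A nonzero tau is
# exactly the "real in some situations" pattern: a modest pooled mean
# can coexist with a nontrivial share of settings with a real effect.
q = 0.0
tau = np.sqrt(tau2)
prop_above = 1 - norm.cdf((q - mu_re) / tau)
print(f"tau^2 = {tau2:.3f}, pooled effect = {mu_re:.3f}, "
      f"P(true effect > {q}) ~ {prop_above:.2f}")
```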
10.) N.B.: Original authors Jane Risen and Tom Gilovich were very forthcoming and helpful throughout. Risen and I had wonderful exchanges of puzzlement and speculation.
You can follow @_MMathur.