1/ Thread alert

I have so many different thoughts about @farid_anvari's thoughtful thread on the meaninglessness of our standardized effect sizes. I'll try to thread them in response. @pdakean @siminevazire @syeducation @dingding_peng https://twitter.com/farid_anvari/status/1332745933860298754
2/ First, a few thoughts in defense of standardized effect sizes. Standardized effect sizes are really useful for study design. As noted, if the true effect sizes of birth order or ego depletion are closer to correlations of .04 or so, that's incredibly useful to know.
3/ I can now design my study to be maximally informative by using closer to 3000 participants.

Standardized effects also help me to understand how reliable all of the prior research is given the average sample size for those studies was well below 100.
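That back-of-the-envelope sample size can be checked with a standard Fisher-z power calculation. A minimal sketch, assuming a two-tailed test at alpha = .05 with 80% power (the thread states neither assumption):

```python
import math

def n_for_correlation(r, z_alpha=1.959964, z_beta=0.841621):
    """Approximate N needed to detect a correlation r, using the
    Fisher z transformation (two-tailed alpha = .05, 80% power)."""
    z_r = math.atanh(r)  # Fisher z of the target correlation
    return math.ceil(((z_alpha + z_beta) / z_r) ** 2 + 3)

# An effect near r = .05 needs roughly 3,100 participants;
# r = .04 pushes the requirement closer to 5,000.
print(n_for_correlation(0.05))
print(n_for_correlation(0.04))
```

The point stands either way: true effects in the .04-.05 range put an informative study in the thousands of participants, not the dozens typical of the older literature.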
4/ That is to say, most prior research on ego depletion is at best uninformative, and at worst represents the small tail of results that found large enough effect sizes to be statistically significant in small sample studies--i.e., biased.
5/ Using standardized effect sizes also lets you compare the relative importance of different variables on a common metric.
6/ Given the gross commonalities of our research--the ubiquitous use of self-report, 5-point Likert scales--there is some useful information in making that comparison (e.g., conscientiousness is a better predictor of mortality than extraversion).
7/ Finally, standardized effect sizes can be informative given the implicit theories that guide our research.
8/ Birth order, parenting, and ego depletion are all thought to have effect sizes big enough to warrant our disproportionate attention (e.g., hundreds of studies) and to be easily detected (e.g., small N research).
9/ Guess what? Those theories are wrong, and we know that because of the small standardized effect sizes of those variables in well-designed research.
10/ Of course, @farid_anvari correctly criticizes the standardized metric because it can hide or gloss over a lot of information that can have profound effects on the estimated effect size.
11/ For example, if you are studying something that has a low base rate in your sample or study design, then you may inadvertently make a huge effect look small.
12/ All of you folks who use cross-lag panel designs with low base rates of change in your variables of choice have quite possibly misled everyone with your null or small effect findings, including yourselves.
13/ Also, we really don't know the practical meaning of our standardized metrics. What does a correlation of .05 really mean?
14/ When our outcomes are measured on 5-point Likert scales with standard deviations of .6 to .8, I hate to break it to you, but you are barely moving the dial on the outcome. Even correlations of .3 fail to move a population half a rating scale point.
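The arithmetic behind that claim is just the bivariate regression slope in raw outcome units. A minimal sketch, assuming a standardized predictor (so the expected shift for a 1-SD change in the predictor is r times the outcome SD):

```python
def likert_shift(r, sd_outcome):
    """Expected raw shift in the outcome for a 1-SD change in the
    predictor: the bivariate regression slope, r * SD_outcome."""
    return r * sd_outcome

# With 5-point Likert outcomes whose SDs run .6 to .8:
for r in (0.05, 0.1, 0.3):
    for sd in (0.6, 0.8):
        print(f"r = {r:.2f}, SD = {sd}: shift of {likert_shift(r, sd):.2f} points")
# Even r = .3 with SD = .8 moves the outcome only about .24 of a
# point -- well short of half a rating-scale point.
```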
15/ Of course it is a lot easier to understand the meaning of a standardized effect if you, well, unstandardize it, and use a meaningful metric.
16/ We did that in one study and found that a beta weight of .05 predicted 3 months of educational attainment. As a father of college students, I assure you, one semester--which can translate into finishing a degree and 30K--is huge.
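The unstandardizing step here is just the beta weight times the outcome's standard deviation (with a standardized predictor). A sketch of the conversion; the ~60-month SD for educational attainment is my back-solved assumption, not a figure reported in the study:

```python
beta = 0.05             # standardized regression weight from the thread
sd_outcome_months = 60  # assumed SD of educational attainment (~5 years);
                        # back-solved for illustration, not reported

# Raw effect: months of attainment per 1-SD change in the predictor
raw_effect = beta * sd_outcome_months
print(f"{raw_effect:.1f} months")  # about one semester
```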
17/ Also, the meaning of a small effect size may depend on your level of intervention. I disparaged the effect of socioeconomic status on personality (r of about .05) only to be schooled by an economist who showed me how that effect was policy relevant if you were intervening...
...at the level of a city, state, or country. Just because your guild might think poorly of an r of .05, don't assume others won't find that an actionable effect size.
19/ So, why don't we have more studies that use interpretable metrics? It's because we don't value field or applied research. Period.
20/ I mean, the information is out there, right now, sitting in countless publicly available longitudinal studies that contain objective outcomes--educational attainment, mortality, income, relationship history, etc.
21/ An intrepid researcher could easily do the research to provide the translation table we would all need to better understand what .25 of a standard deviation change on a measure means on any of our favorite variables. Get to it. I'm sure JPSP will publish that (...sarcasm).
22/ And, finally, on the last point about interactions, cumulative small effects, & mediation: they are even less important than small main effects. They are canards, mythical creatures whose invocation brings us comfort as we lie awake in the middle of the night.
23/ If they existed, they would have been reported. Interactions don't happen often and when they do, their effect is smaller than the main effects. Small effects that accumulate? Sorry, the modal effect size decreases with time.
24/ Mediation? Do the math. .3 times .3 and you get .09. It only makes things smaller. Don't do the math with .03 times .03. It gets too depressing.
25/ So, yes, relying solely on standardized metrics is not ideal. But it still provides some useful information, especially in fields that use common methods. Of course, nothing stops us from doing the work with real metrics, which invites the question of why we don't.
You can follow @BrentWRoberts.