Okay, while I'm not saying that fMRI measures are ready to be clinical-level biomarkers yet, I think this study (and everyone amplifying it) needs to slow their roll on the conclusions.
PDF: https://journals.sagepub.com/doi/pdf/10.1177/0956797620916786?casa_token=UIPjXcEb-UIAAAAA:cavn9LGRCvgYCaEj7YHuK3rPXuwMLMtlQdYnxbwyb5t9J-h1RSpG93PmJiNgoMt3IzlyaLG7Y19x
THREAD 1/11 https://twitter.com/CT_Bergstrom/status/1277060876386721792
1) The intraclass correlation (ICC) when comparing task-linked BOLD responses across *all* voxels in the brain is 0.397. Since only a subset of voxels is actually relevant for a given task (i.e., task-related by a statistical test), this shouldn't be that surprising. 2/11
Activation in sensory areas will be *highly* unstable across runs/sessions/scanners/centers because it is modulated by subtle variability in inputs (e.g., stimulus brightness) and by factors like attention. Noise voxels will just be noise.
This will all pull down the ICC.
3/11
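A toy simulation of this point (not the paper's analysis; subject counts, voxel counts, and signal/noise levels are all made up, and per-voxel test–retest correlation stands in for a consistency ICC): averaging reliability over a mass of no-signal voxels drags the whole-brain number way down, even when the task voxels themselves are solidly reliable.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sub = 100  # hypothetical number of subjects

def simulate_voxels(n_vox, signal_sd, noise_sd):
    """Per-subject 'true' activation plus independent session noise."""
    true = rng.normal(0.0, signal_sd, (n_sub, n_vox))
    session1 = true + rng.normal(0.0, noise_sd, (n_sub, n_vox))
    session2 = true + rng.normal(0.0, noise_sd, (n_sub, n_vox))
    return session1, session2

def mean_reliability(s1, s2):
    # Per-voxel test-retest correlation across subjects (ICC stand-in)
    rs = [np.corrcoef(s1[:, v], s2[:, v])[0, 1] for v in range(s1.shape[1])]
    return float(np.mean(rs))

# Task voxels carry stable individual differences; noise voxels carry none.
task1, task2 = simulate_voxels(200, signal_sd=1.0, noise_sd=0.6)
noise1, noise2 = simulate_voxels(800, signal_sd=0.0, noise_sd=1.0)

r_task = mean_reliability(task1, task2)
r_all = mean_reliability(np.hstack([task1, noise1]), np.hstack([task2, noise2]))
print(f"task voxels only: {r_task:.2f}")  # high
print(f"all voxels:       {r_all:.2f}")   # dragged toward zero by noise voxels
```

Same data, two summaries: averaging over everything buries the reliable subset.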
In fact, the study reports that the average ICC when thresholded voxels are used (i.e., only the voxels deemed relevant for the task) rises to 0.705.
So when voxels are engaged in a task, those voxels tend to be pretty reliable!
4/11
2) For the region analysis, the authors use "thresholded" maps (Fig. 4), i.e., t-tests. There are 2 ways a t-test can fail to reach significance: too small a numerator (the mean) or too large a denominator (the variability). We have no idea which is driving the variance in this figure. 5/11
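A quick illustration of the two failure modes (hand-made numbers, two hypothetical "voxels" — nothing here comes from the paper): both land below the usual cutoff, but one fails because the effect is near zero and the other because the variability is huge.

```python
import numpy as np
from scipy import stats

# Voxel A: numerator problem -- mean is essentially zero.
small_mean = np.array([0.1, -0.2, 0.15, -0.1, 0.05, 0.0, -0.05, 0.1])
# Voxel B: denominator problem -- mean is clearly positive, but sd is huge.
big_noise = np.array([6.0, -4.0, 8.0, -5.0, 7.0, -6.0, 9.0, -3.0])

for name, x in [("small numerator", small_mean), ("large denominator", big_noise)]:
    t, p = stats.ttest_1samp(x, 0.0)
    print(f"{name}: mean={x.mean():+.2f}, sd={x.std(ddof=1):.2f}, "
          f"t={t:.2f}, p={p:.3f}")
```

A thresholded map shows only "not significant" for both, erasing the distinction between a missing effect and a noisy one.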
Within a region there might be stable and unstable voxels. The study doesn't show voxelwise stability maps and instead bases its conclusions largely on thresholded activation patterns. It's not possible to know whether the variability here is uniform, or to what degree. 6/11
3) Studies that actually try to build predictive models from task-related activity do not weight *all* voxels equally. Some are relevant for predicting individual differences, some aren't. Assuming that low ICC across all voxels means biomarkers are infeasible is naive. 7/11
If fMRI measures were as unreliable as claimed, then we wouldn't be able to predict individual differences under cross-validation. But we can!
For example:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5008686/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5634271/
https://www.jneurosci.org/content/31/2/439.short
(review here)
https://www.biologicalpsychiatryjournal.com/article/S0006-3223(20)30111-6/pdf
8/11
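A minimal sketch of the logic (synthetic data, made-up dimensions; not reproducing any of the studies above): a regularized model trained on some subjects and scored on *held-out* subjects can only predict if the voxel patterns carry some stable, subject-specific signal. With zero reliability, out-of-sample prediction would sit at chance.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sub, n_vox = 200, 50

# Hypothetical setup: a few voxels carry signal about a behavioral score,
# most are irrelevant -- which is why models shouldn't weight voxels equally.
w_true = np.zeros(n_vox)
w_true[:5] = 1.0
X = rng.normal(size=(n_sub, n_vox))                   # per-subject voxel patterns
y = X @ w_true + rng.normal(scale=1.0, size=n_sub)    # behavioral measure

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# 5-fold cross-validation: fit on 4 folds, predict the held-out subjects.
folds = np.array_split(rng.permutation(n_sub), 5)
preds = np.empty(n_sub)
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n_sub), test_idx)
    w = ridge_fit(X[train_idx], y[train_idx])
    preds[test_idx] = X[test_idx] @ w

r = np.corrcoef(preds, y)[0, 1]
print(f"out-of-sample prediction r = {r:.2f}")
```

The point: cross-validated prediction accuracy is itself a reliability check, because it cannot exceed what the measurement's stable signal supports.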
In fact, there are many techniques used in our field (e.g., crossnobis estimators, PCM, any model with proper cross-validation) that just wouldn't be feasible (but are) if fMRI were so unreliable.
This is all but ignored in the conclusions.
9/11
4) Not all tasks are the same. A lot of experimental design factors can determine when a task is or isn't relevant for predicting individual differences.
(e.g., https://academic.oup.com/scan/article/doi/10.1093/scan/nsaa050/5821247?searchresult=1)
Crappy tasks have low reliability, regardless of whether they're done in an MRI.
10/11
In summary:
Do the authors raise valid concerns that we have to address as a field? Yes.
Do their data invalidate the feasibility of fMRI as a marker of individual differences? Absolutely not.
11/11