Would you see a gorilla hiding in your data? It depends on what question you’re trying to answer.

**In many cases, a hypothesis can be a liability**

Awesome new paper by @ItaiYanai and @MartinJLercher: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02133-w
You’re probably familiar with the famous “selective attention experiment”. If you aren’t, stop what you’re doing and watch the video.
Participants are asked to count the number of times a group of people pass a basketball back and forth - and because you’re focused on counting, you miss the person in the gorilla suit who walks through the frame!
The authors designed a similar experiment to test data analysis skills. They made up a dataset comparing “BMI” vs. “step count” for men and women and gave it to their students to analyze. The catch: when you plot the data, the points draw a picture of a gorilla.
One group of students was given specific hypotheses to investigate, like “is there a correlation between BMI and daily steps?”, and was then asked whether there was anything else notable in the data. A second group of students was simply asked what they could conclude from the data.
Students who were not presented with a hypothesis to investigate were 5x more likely to discover the gorilla than students who were given specific questions to answer.
The lesson - in science, always always always look at your data! And, more broadly, if you’re just trying to answer specific questions using your data, it’s quite easy to miss the gorilla in the room.
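To make “look at your data” concrete, here’s a minimal sketch in plain Python. The toy dataset is hypothetical (points on a ring, not the paper’s data), and `pearson` and `ascii_scatter` are illustrative helpers I made up for this example. The summary statistic says “nothing to see here”, while even a crude text plot shows obvious structure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient, computed by hand."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def ascii_scatter(xs, ys, width=41, height=15):
    """Draw a rough scatter plot as text, with y increasing upward."""
    lo_x, hi_x = min(xs), max(xs)
    lo_y, hi_y = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = int((x - lo_x) / (hi_x - lo_x) * (width - 1))
        row = int((y - lo_y) / (hi_y - lo_y) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line) for line in grid)

# Hypothetical toy data: 60 evenly spaced points on a ring.
angles = [2 * math.pi * k / 60 for k in range(60)]
steps = [8000 + 4000 * math.cos(a) for a in angles]
bmi = [25 + 5 * math.sin(a) for a in angles]

r = pearson(steps, bmi)
plot = ascii_scatter(steps, bmi)

print(f"correlation between steps and BMI: {r:.3f}")
print(plot)
```

If you only asked “is there a correlation?”, you’d report roughly zero and move on. The plot makes it hard to miss that the points are anything but random.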
I think that these results - and the title of the paper - are exceptionally important to keep in mind in my own field of cancer research.

I sat on my first NIH study section last month and wow, hypothesis-motivated reasoning was exceptionally common.
Some labs had been studying the same gene, using the same two lentiviral shRNA hairpins, for 10+ years, claiming some role for their gene in every hot pathway - metabolism, epigenetic regulation, whatever.
But if you take a step back and look the gene up on http://www.depmap.org, in some cases unbiased screening shows that loss of that gene has no fitness effect whatsoever.
I’d rather fund a project about a gene demonstrated to be important via unbiased DepMap screening than a gene written about in 10 hypothesis-driven Cell papers that have never been independently validated.
Like the paper says, a hypothesis can be a liability. Once you’ve decided “gene X is important for lung cancer”, you can miss all evidence to the contrary.

Forget your hypothesis, take a step back, and just look at the data!
As we wrote about in a recent @NatureRevGenet article, there are excellent unbiased resources for target discovery in cancer. I wish more people used them!

https://www.nature.com/articles/s41576-020-0247-7
You can follow @JSheltzer.