How should we create knowledge?

We know "The Scientific Method" well: we ask questions then seek answers.

But what about having many answers, and no questions?

(đŸ§”Paper summary + more)
"Here is the evidence, now what is the hypothesis?"

This commentary was written in 2003 by Kell & Oliver, after the Human Genome project was completed.

They saw a proliferation of data in many fields, but also saw resistance on data-driven explorations.
Science was mostly driven by hypotheses.
i.e. we begin by ideas/questions, then look for the data to validate.

Data-driven exploration was sometimes rejected & criticized as “merely a fishing expedition’’.

To understand why, we need to visit logic.
We build knowledge through inference — through cause/effect relations.

There are two (main) types for inference:
- Deductive: it rained (cause) -> grass is wet (effect)
- Inductive: grass is wet + we didn't water -> it rained

Hypothesis is deductive, data-driven is inductive.
There is a reason why scientists prefer deduction:

Induction is 'philosophically insecure', because there could be counter examples that we didn't see.

‘The great tragedy of Science: the slaying of a beautiful hypothesis by an ugly fact’—T.H. Huxley
Hume illustrates this with his turkey example:

You see a turkey being fed everyday at 9am.
You assume that everyday, it’s fed at 9am.
Except one day, it’s not fed, it’s slaughtered.
This is one caveat of inductive science, it is all laden with assumption.

David Wolpert calls this the No Free Lunch (NFL) theorem.
It’s now widely used as an anchor when considering models. NFL means you can induce but please acknowledge your assumptions.
You can hear Wolpert's discussion in @sfiscience’s recent podcast.
https://complexity.simplecast.com/episodes/45 

Also, here’s an excellent essay on how to understand the No Free Lunch Theorem by @amuellerml
https://peekaboo-vision.blogspot.com/2019/07/dont-cite-no-free-lunch-theorem.html
Anyways, we’ve had fruitful attempts at hypothesis-free science (i.e. inductive):

Examples:
- Epidemiology 🔬— many diseases studied through exploration.
- Astronomy 🌌 — Kepler monitored planet orbits, logged the data and then induced a math relationship.
We lose a lot by ignoring inductive methods.

Inductive methods give us the freedom to look at the system as a whole and deduce insights we might not have constructed otherwise.
Example: Molecular vs. Systems Biology on Genes

Molecular biologists would seek gene functions, then find the genes.

Systems biologists would start with all data.

It turned out that the former missed 40% of uncovered genes, because they were geared on functions only.
So much has happened since 2003.

We have computational power that enable us navigate different flavors of big data. We even have the capacity to automate some hypotheses discovery.

Data science evolved and formalized as a proactive arm that unearths insights from big data.
Takeaway message:

We need both hypothesis-driven and data-driven sciences.
They are complementary.

Data-driven science is laden with assumptions,
but it help us view the system as a whole.

[paper: https://cpb-us-w2.wpmucdn.com/sites.gsu.edu/dist/d/2411/files/2017/10/Kell_et_al-2004-BioEssays-1sin81j.pdf]
You can follow @NajlaAlariefy.
Tip: mention @twtextapp on a Twitter thread with the keyword “unroll” to get a link to it.

Latest Threads Unrolled:

By continuing to use the site, you are consenting to the use of cookies as explained in our Cookie Policy to improve your experience.