What makes de novo genes so special? Their origin story: it’s a rags-to-riches evolutionary tale!

Most new genes are spin-offs of existing genes (via duplication, fusion, etc.), but de novo genes start “from scratch” from previously non-coding sequences.
2/10
Over time, these non-coding sequences can gain new promoters, new ORFs, and potentially become full-fledged proteins with novel functions.

In fact, de novo genes have already been discovered in dozens of species including fruit fly 🪰, rice 🌾, and even us humans 🧑‍🤝‍🧑.
3/10
The vast majority of the human genome is non-coding, which, again, is the raw material necessary for de novo gene birth.

By contrast, the majority of the baker’s yeast genome is coding (70%). How can young de novo genes get started in a genome with so little raw material?
4/10
As we wanted to study the early stages of de novo gene birth, we couldn’t rely on the reference annotations. We designed an experiment with RNAseq for 11 different species of yeast + ribosome profiling for baker’s yeast, for two conditions:
1) rich media
2) oxidative stress
5/10
The RNAseq data allowed us to study both annotated & unannotated transcripts in baker’s yeast; we then checked if they were conserved in other closely-related species.

We also identified which ORFs were being actively translated using our ribosome profiling data.
6/10
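For readers who want to see what "an ORF" means in practice, here is a minimal sketch in Python. It is an illustration only, not the pipeline used in this study: `find_orfs` and its `min_codons` cutoff are hypothetical, it scans only the forward strand, and it says nothing about whether an ORF is actually translated (that is what the ribosome profiling data are for).

```python
# Hypothetical helper for illustration; not the method used in the study.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=10):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    `min_codons` counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next start
    return orfs
```

A real analysis would also scan the reverse complement and would choose the length cutoff deliberately, since very short ORFs appear by chance in any sequence.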
We classified a subset of 213 taxonomically-restricted transcripts as de novo, many of which contain translated ORFs (97/213). These peptides were translated for the first time ever at some point over the last ~20 million years! 🙀
7/10
As has been described in some other studies, we observed that our translated de novo peptides were significantly shorter, had lower coding scores, and had higher isoelectric points than those of conserved transcripts (which are mostly annotated genes).
8/10
Surprisingly, we found that about half of the de novo transcripts overlapped another transcript on the opposite strand (105/213); this configuration is much rarer for more conserved transcripts. Coding sequences with antisense overlap = interesting evolutionary dynamics! 🧐
9/10
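To make "overlapping another transcript on the other strand" concrete, here is a small Python sketch of an antisense-overlap check. It is an assumption-laden illustration, not the study's code: the `Transcript` record, its half-open 0-based coordinates, and the `antisense_overlap` function are all hypothetical names chosen for this example.

```python
from typing import NamedTuple

class Transcript(NamedTuple):
    """Hypothetical transcript record; coordinates are 0-based, end-exclusive."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"

def antisense_overlap(a: Transcript, b: Transcript) -> int:
    """Return the number of bases shared by two transcripts on opposite strands.

    Returns 0 if they are on different chromosomes, on the same strand,
    or do not overlap.
    """
    if a.chrom != b.chrom or a.strand == b.strand:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))
```

For example, a transcript at 100-400 on the "+" strand and a gene at 300-900 on the "-" strand of the same chromosome share 100 bases in antisense; the same two intervals on the same strand would not count.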
There also appeared to be a correlation between the transcriptional regulation of de novo transcripts and that of their overlapping genes in response to changes in environmental conditions (oxidative stress in our case). This can affect the evolution of de novo genes in compact genomes.
10/10
These discoveries (and many more) are all thanks to my PhD supervisors Mar Alba @maralbasoler and Lucas Carey @LucasBCarey, as well as all of the coauthors, colleagues, and collaborators who made this project possible!
You can follow @willblev.