Thread by @momeara, I'm thrilled to announce @teresa_omeara and I have a new preprint: DeORFanizing Candida albicans [...]

I'm thrilled to announce @teresa_omeara and I have a new preprint: DeORFanizing Candida albicans Genes using Co-Expression https://www.biorxiv.org/content/10.1101/2020.12.04.412718v1

DeORFanizing Candida albicans Genes using Co-Expression

Functional characterization of open reading frames in non-model organisms, such as the common opportunistic fungal pathogen Candida albicans, can be labor intensive. To meet this challenge, we built...

@teresa_omeara and I have collaborated on 5 papers since grad school, but it was really fun to work on this as our first two-author paper.

We asked if there was enough RNAseq data to build a useful co-expression network for Candida albicans for gene function prediction. It turned out to work amazingly well!

We're working on getting the Candida Albicans Co-Expression Network (CalCEN) out to the community perhaps through FungiDB. Meanwhile, send us your favorite gene and we'll analyze it for you :D

Several years ago I made a network between proteins based on their ligand similarity. @JesseAGillis and @SaraBallouz taught us how to do guilt-by-association gene function prediction and many of the pitfalls to watch out for https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0160098

The expansion of protein-ligand annotation databases has enabled large-scale networking of proteins by ligand similarity. These ligand-based protein networks, which implicitly predict the ability of...

For example, there are a few multi-functional genes, like p53 and HSP90, that tend to be involved in lots of functions. If a network simply predicts these for all functions, it does pretty well at retrospective gene function prediction. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0017258

The Impact of Multifunctional Genes on "Guilt by Association" Analysis

Many previous studies have shown that by using variants of “guilt-by-association”, gene function predictions can be made with very high statistical confidence. In these studies, it is assumed that...

The trouble is, just predicting multi-functional genes isn't very useful for finding new genes for a given function or new functions for a given gene! To test for this bias in a network, simply predict genes by their network degree for all functions. This is the Degree Null Predictor.
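To make the Degree Null Predictor concrete, here is a minimal sketch in Python, not the code from the preprint. It assumes the network is a symmetric gene-by-gene weight matrix and that each function is a boolean vector of known member genes. The function names (`auroc`, `degree_null_scores`, `neighbor_voting_scores`) are made up for illustration, and the neighbor-voting score is a simplified version that skips the cross-validation a real evaluation would use.

```python
import numpy as np

def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formula."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos, n_neg = labels.sum(), (~labels).sum()
    # Assign 1-based ranks, then average them within groups of tied scores.
    order = np.argsort(scores, kind="stable")
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def degree_null_scores(network):
    """Score every gene by its node degree, ignoring the function entirely."""
    return np.asarray(network, dtype=float).sum(axis=1)

def neighbor_voting_scores(network, known):
    """Fraction of each gene's edge weight that connects to known members."""
    network = np.asarray(network, dtype=float)
    degree = network.sum(axis=1)
    votes = network @ np.asarray(known, dtype=float)
    # Genes with no edges get a score of 0 instead of a division by zero.
    return votes / np.where(degree > 0, degree, 1.0)

if __name__ == "__main__":
    # Toy network (assumed data): gene 0 is a hub connected to every other gene.
    toy = np.array([
        [0, 1, 1, 1, 1],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
    ], dtype=float)
    members = np.array([True, True, False, False, False])
    print("degree null AUROC:", auroc(degree_null_scores(toy), members))
    print("neighbor voting AUROC:", auroc(neighbor_voting_scores(toy, members), members))
```

If the degree-only AUROC is close to the network's real prediction AUROC across many functions, the network is mostly rewarding hub genes rather than learning anything function-specific.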

While @teresa_omeara was in the @CowenLab, she had several really cool functional genomics projects for Candida albicans, including screening deletion collections and building protein-protein interaction networks, and I helped with the bioinformatic analysis.

Would co-expression complement these studies? There are 18 large-scale RNAseq studies in Candida albicans. Is this enough to make a useful co-expression network? It looks like ~10 are needed, and the performance hasn't saturated by 18, so 18 is plenty but more would help.
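One common way to combine several RNAseq studies into a single co-expression network can be sketched as follows. This is an illustrative recipe under stated assumptions, not necessarily the exact CalCEN pipeline: each study is a genes-by-samples matrix with the same gene order, per-study Spearman-style correlations are rank-standardized to [0, 1], and the studies are then averaged. All function names here are hypothetical.

```python
import numpy as np

def rank_rows(x):
    """Rank values within each row (0 = smallest); ties are broken by position."""
    first = np.argsort(x, axis=1, kind="stable")
    return np.argsort(first, axis=1, kind="stable").astype(float)

def study_coexpression(expr):
    """Gene-by-gene correlation of within-gene ranks for one study.

    expr is a genes x samples matrix. A gene with constant expression
    yields NaN correlations, so filter such genes beforehand.
    """
    return np.corrcoef(rank_rows(np.asarray(expr, dtype=float)))

def rank_standardize(corr):
    """Replace each gene-pair correlation with its rank, scaled to [0, 1]."""
    n = corr.shape[0]
    upper = np.triu_indices(n, k=1)
    values = corr[upper]
    ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable")
    scaled = ranks / max(len(values) - 1, 1)
    out = np.zeros((n, n))
    out[upper] = scaled
    return out + out.T  # symmetric, with a zero diagonal

def aggregate_network(studies):
    """Average the rank-standardized networks across studies."""
    return np.mean([rank_standardize(study_coexpression(s)) for s in studies], axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed toy data: 3 studies, 6 genes, 8 samples each.
    studies = [rng.normal(size=(6, 8)) for _ in range(3)]
    print(aggregate_network(studies).round(2))
```

Rank-standardizing before averaging keeps any one study with unusually strong correlations from dominating the combined network, which is one reason adding more studies tends to help.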

Comparing co-expression to other networks, we see that CalCEN has strong predictive accuracy with very low multifunctionality bias. Further, when combined with other networks, it adds more signal.

To explore the network, we looked first at known gene clusters. E.g., histone proteins cluster together, except for HHT1, consistent with recent findings in https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000422

As a case study, we use CalCEN to predict a role for C4_06590W in the cell cycle.

Using deep learning de novo structure prediction with TrRosetta, we verify that it has a DnaJ domain similar to the solved structure of SIS1 in Saccharomyces cerevisiae.

Consistent with a role in the cell cycle, depleting it causes filamentation and hypersensitivity to cell cycle inhibitors. So we call it "Cell Cycle DnaJ" (CCJ1).
